Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parser for line by line processing #8719

Merged
merged 51 commits into from
Feb 1, 2024
Merged

Add parser for line by line processing #8719

merged 51 commits into from
Feb 1, 2024

Conversation

jdunkerley
Copy link
Member

@jdunkerley jdunkerley commented Jan 9, 2024

Pull Request Description

  • ✅Linting fixes and groups.
  • ✅Add File.from that:Text and use File conversions instead of taking both File and Text and calling File.new.
  • ✅Align Unix Epoc with the UTC timezone and add converting from long value to Date_Time using it.
  • ❌Add simple first logging API allowing writing to log messages from Enso.
  • ✅Fix minor style issue where a test type had a empty constructor.
  • ❌Added a long based array builder.
  • Added File_By_Line to read a file line by line.
  • Added "fast" JSON parser based off Jackson.
  • ✅Altered range to_vector to be a proxy Vector.
  • ✅Added at and get to Database.Column.
  • ✅Added get to Table.Column.
  • ✅Added ability to expand Vector, Array Range, Date_Range to columns.
  • ✅Altered so expand_to_column default column name will be the same as the input column (i.e. no Value suffix).
  • ✅Added ability to expand Map, JS_Object and Jackson_Object to rows with two columns coming out (and extra key column).
  • ✅ Fixed bug where couldn't use integer index to expand to rows.

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the
    Scala,
    Java,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • All code has been tested:
    • Unit tests have been written where possible.
    • If GUI codebase was changed, the GUI was tested when built using ./run ide build.

@jdunkerley jdunkerley force-pushed the wip/jd/edrp-parser branch 3 times, most recently from ab8ec9e to 915b253 Compare January 22, 2024 10:59
@enso-bot enso-bot bot mentioned this pull request Jan 23, 2024
6 tasks
@jdunkerley jdunkerley force-pushed the wip/jd/edrp-parser branch 3 times, most recently from c92bd0a to cad33e9 Compare January 29, 2024 11:32
@jdunkerley jdunkerley marked this pull request as ready for review January 29, 2024 14:15
distribution/lib/Standard/Base/0.0.0-dev/src/Logging.enso Outdated Show resolved Hide resolved
Comment on lines +825 to +827
## PRIVATE
Convert from a Text to a File.
File.from (that:Text) = File.new that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been adding a similar conversion as part of #8809 work, I think it may be worth discussing some future improvements for it as we actually may want to support not only local paths but also e.g. s3:// paths as well, if AWS is imported.

My idea is to do a File_Like.from (that:Text) = ... that would rely on a ServiceProvider logic that allows to register new protocols for resolving files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy for us to end up with that once yours is ready just felt a good change to get rid of all the File.new dotted about and use the conversion point to catch on the way through.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't File a builtin? If so, then it is backed by TruffleFile and not java.io.File. TruffleFile allows any kind of FileSystem including S3, once somebody writes it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I think we were not aware of this. We may consider going in this direction, but right now we are not handling S3 at builtin level, but as a separate type (S3_File), same as our cloud files (Enso_File).

This approach does allow us to do some more special handling for these special files and have better control of what is supported and what is not. For example, the S3_File holds AWS credentials associated to it, in case the user is working with files from different accounts, I don't think that would be possible to implement something like that using the TruffleFile abstraction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, moreover - S3 support is included in a separate library (Standard.AWS) as it should not really be part of Base.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think we should investigate the options of a FileSystem but that is one for a little down the road.

@jdunkerley jdunkerley added the CI: Ready to merge This PR is eligible for automatic merge label Jan 31, 2024
@mergify mergify bot merged commit eeaddbc into develop Feb 1, 2024
26 of 28 checks passed
@mergify mergify bot deleted the wip/jd/edrp-parser branch February 1, 2024 07:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants