Skip to content

Add diagnostics (and some other improvements) #3223

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 20, 2023
Merged

Conversation

jachris
Copy link
Contributor

@jachris jachris commented Jun 14, 2023

Note: This PR is quite big (because a lot of the changes were interdependant).

image

Changelog

  • Added tenzir -f <file> to load and execute a file.
  • Added struct diagnostic { ... } and struct location { ... } to emit diagnostics (errors, warnings, notes) during parsing and execution (with more diagnostics to come in the future).
  • Added a recursive-descent parser that, in contrast to the pure combinator-based approach, is able emit helpful diagnostics.
  • Added a colorful diagnostic printer which prints annotated snippets of the relevant code.
  • Added an operator argument parser to facilitate operator implementations, has diagnostics, usage and docs links
  • Added a parse_operator(...) function to operator plugins (and similar for loaders/parsers/printers/saves), with a deprecated fallback to make_operator(...).
  • Changed serialization to use inspect(op) instead of to_string(op). This not only solves some incomplete escaping attempts, but also allows transporting metadata (such as location) and eliminates the need for workarounds for serialization of export | where ... and other optimizations.
  • Added separate plugins for parsing and serialization. This is also because the plugin (implicitly) provides the inspect implementation, but not all plugins define operators (some resolve to already existing operator implementations). Inheriting from operator_plugin implicitly adds both.
  • Improved loader/parser/printer/saver plugin API (e.g. using a printer with invalid arguments will now error when parsing the pipeline, instead of during execution or sometimes even not at all).
  • Changed the main code path to not use language plugins anymore. They can be replaced with better solutions soon.
  • Changed the default connector to stdin/stdout for all formats (e.g. to unify write json and write csv).

Future work

  • The recursive descent parser is pretty experimental and will probably see quite some improvements in the future. In particular, this applies to the built-in dynamic lexer and the parser interface, which were just built to get the job done. This absolutely needs some more work, but we might want to merge this anyways to already get everything mentioned above and avoid merge conflicts.
  • The diagnostic printer already supports disabling colors, but we need some detection logic with manual overrides.
  • The diagnostic printer works with files and command-line strings, but the output format was built for files. We should consider adding an alternative diagnostic printer if the input was given as a one-line string.
  • Make use the diagnostics in more places. In particular, I did not convert many places yet where warnings/errors could be emitted during execution.
  • Convert remaining operators while expanding the parser as needed.
  • Currently, the parser makes use of exceptions (which get caught at an outer layer). This becomes problematic once exceptions cross our plugin API boundaries, that is: parse_operator(...) can currently throw a diagnostic. I don't really like this. Furthermore, as it is right now, plugin parsers cannot emit warnings. At least this should be changed.
  • This still contains some definitions and a parser for the potential upgrade of expression. They are more or less dead code and are to be upgraded as part of the expression revamp, or eventually deleted if they stay unused.
  • This PR adds a default implementation for operator_base::to_string() that uses caf::detail::stringification_inspector. This output is not too readable, and in particular unsuitable for user-facing messages. We should consider a json_writer that uses to_string(...) for some types, such as expressions.
  • Try to find a solution for the problem that arises from the implicit plugin::name() overrides in connection with defining both loader and saver (or parser and printer).

@jachris jachris added feature New functionality enhancement labels Jun 14, 2023
@jachris jachris force-pushed the topic/diagnostics-et-al branch from c75736d to 1d63455 Compare June 14, 2023 12:02
@jachris jachris changed the title Diagnostics (and some other improvements) Add diagnostics (and some other improvements) Jun 14, 2023
@jachris jachris force-pushed the topic/diagnostics-et-al branch 11 times, most recently from 4a8df5c to 64c0d99 Compare June 15, 2023 16:12
@jachris jachris force-pushed the topic/diagnostics-et-al branch 3 times, most recently from 0143b7c to 6ab329e Compare June 16, 2023 09:05
Copy link
Member

@dominiklohmann dominiklohmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a fantastic piece of work. This had a three-part review that's not directly on GitHub as that became impractical due to the size of the PR.

We have a small list of follow-ups that we need to tackle. I fully trust that @jachris implements them before merging, so I can already greenlight this as-is.

We should avoid merging other PRs until this is in so we can avoid further merge conflicts.

@tobim
Copy link
Member

tobim commented Jun 16, 2023

We should avoid merging other PRs until this is in so we can avoid further merge conflicts.

We need to merge #3232 for CI. It would also be great to not delay #3225, @lava is going to check how much both branches conflict.

@jachris
Copy link
Contributor Author

jachris commented Jun 16, 2023

We need to merge #3232 for CI. It would also be great to not delay #3225, @lava is going to check how much both branches conflict.

I will enable auto merge here any second now. But CI might fail also for actual reasons, so go ahead if you want to.

@lava
Copy link
Member

lava commented Jun 16, 2023

@lava is going to check how much both branches conflict.

There's two conflicts, one in fwd.hpp and one in pipeline_manager.cpp, both are just 1-2 lines and straightforward to resolve.

@jachris jachris force-pushed the topic/diagnostics-et-al branch from a0ded22 to 39bb243 Compare June 16, 2023 15:11
@jachris jachris marked this pull request as ready for review June 16, 2023 15:11
@jachris jachris enabled auto-merge June 16, 2023 15:12
@jachris jachris force-pushed the topic/diagnostics-et-al branch 6 times, most recently from 94674d3 to 1420765 Compare June 19, 2023 13:21
@jachris jachris force-pushed the topic/diagnostics-et-al branch from 1420765 to 0bc352f Compare June 19, 2023 13:31
@jachris jachris merged commit 293cb66 into main Jun 20, 2023
@jachris jachris deleted the topic/diagnostics-et-al branch June 20, 2023 11:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants