Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace hardcoded argument parsing, help text with mappings #13

Merged
merged 4 commits into from
Jan 7, 2021

Conversation

ferdnyc
Copy link
Contributor

@ferdnyc ferdnyc commented Jun 22, 2020

Speaking of things falling out of sync (yuzutech/kroki#279)...

I noticed when running kroki help convert that the list of possible arguments to --type was incomplete. (See bolded text.) Furthermore, the equivalent valid --format arguments weren't listed at all:


$ ./kroki help convert
Convert text diagram to image

Usage:
  kroki convert file [flags]

Flags:
  -c, --config string     alternate config file [env KROKI_CONFIG]
  -f, --format string     output format (default: infer from output file extension
 otherwise svg)
  -h, --help              help for convert
  -o, --out-file string   output file (default: based on path of input file); use - to output
 to STDOUT
  -t, --type string       diagram type [actdiag, blockdiag, c4plantuml, ditaa, dot, erd,
 graphviz, nomnoml, nwdiag, plantuml, seqdiag, svgbob, umlet] (default: infer
 from file extension)

Thus began my journey of discovery.

The end result is that both lists are now displayed, generated semi-automatically (from two slices that list SupportedDiagramTypes and SupportedImageFormats), with each supported kroki.DiagramType or kroki.ImageFormat automatically mapping to its own lowercased name as both the --type / --format argument, and the recognized file extension that corresponds.

The additional "alias" mappings ("jpg": kroki.JPEG, ".dot": kroki.GraphViz, etc.) are hardcoded into the initial mappings. The init() then augments those mappings with the "standard" names taken from the slices..

Those mappings are used in place of all the previously-hardcoded switch statements in cmd/convert.go, and the Supported,,, slices are also used to embed the list of valid arguments into the help text. (The additional, redundant mapping values are not displayed, to avoid clutter, though they will still be recognized.)

So, the help output looks like this now:


./kroki help convert
Convert text diagram to image

Usage:
  kroki convert file [flags]

Flags:
  -c, --config string     alternate config file [env KROKI_CONFIG]
  -f, --format string     output format [base64 jpeg pdf png svg] (default: infer
 from output file extension otherwise svg)
  -h, --help              help for convert
  -o, --out-file string   output file (default: based on path of input file); use - to output
 to STDOUT
  -t, --type string       diagram type [actdiag blockdiag c4plantuml ditaa erd
 graphviz mermaid nomnoml nwdiag plantuml rackdiag seqdiag svgbob umlet
 vega vegalite wavedrom] (default: infer from file extension)

Functionality should be unchanged, all of the same information is there, it's just structured to (hopefully) make maintenance easier longer-term.

If support is subsequently added for any new kroki.DiagramTypes or kroki.ImageFormats, those constants just have to be added to the SupportedDiagramTypes or SupportedImageFormats slices exported from cmd/convert.go and both the matching and the help text will update automatically.

Honestly, the implementation is kind of a hack, and I'm not particularly thrilled with the way it still requires a lot of "manual" mapping construction (even if it's executed programmatically from cmd/convert.go's new init() function). It feels like there should be a better way. Ideally kroki-go would publish this sort of information in some way that it could be queried directly from there, but publishing the names as constants does not lend itself to that type of introspection at all, so I really struggled to come up with a more robust API.

I had to give myself a half-day crash-course on Go just to write this code, and I don't doubt it can be improved upon. Any suggestions for less stupid, ham-fisted ways of structuring the required data for use by the code would be very welcome.

@ggrossetie
Copy link
Member

This is nice, thanks!

If support is subsequently added for any new kroki.DiagramTypes or kroki.ImageFormats, those constants just have to be added to the SupportedDiagramTypes or SupportedImageFormats slices exported from cmd/convert.go and both the matching and the help text will update automatically.

I like where it's going 👍

Honestly, the implementation is kind of a hack, and I'm not particularly thrilled with the way it still requires a lot of "manual" mapping construction (even if it's executed programmatically from cmd/convert.go's new init() function). It feels like there should be a better way. Ideally kroki-go would publish this sort of information in some way that it could be queried directly from there, but publishing the names as constants does not lend itself to that type of introspection at all, so I really struggled to come up with a more robust API.

Indeed.
At some point, it should part of kroki-go, so we don't have to update the mapping in the CLI when we add/remove/update a diagram type or an image format.

I had to give myself a half-day crash-course on Go just to write this code, and I don't doubt it can be improved upon. Any suggestions for less stupid, ham-fisted ways of structuring the required data for use by the code would be very welcome.

I think we should declare two dictionaries, one for the images formats and one for the diagrams types.

ImageFormat:

  • has a unique id
  • is associated to a list of names
  • is associated to a list of file extensions

For instance, ImageFormat.JPEG:

  • id: JPEG
  • names: ["jpg", "jpeg"]
  • fileExtensions: [".jpg", ".jpeg"]

DiagramType:

  • has a unique id
  • is associated to a list of names
  • is associated to a list of file extensions
  • is associated to a list of supported (output) image format

For instance, DiagramType.GraphViz:

  • id: GraphViz
  • names: ["dot", "graphviz"]
  • fileExtensions: [".dot", ".gv", ".graphviz"]
  • supportedImageFormats: [ImageFormat.PNG, ImageFormat.SVG, ImageFormat.JPEG, ImageFormat.PDF]

We can then create maps from the dictionaries depending on what we need. We could even create a tiny "query API" (i.e. get the DiagramType for a given file extension) on top of them.

What do you think?

@ferdnyc
Copy link
Contributor Author

ferdnyc commented Jun 25, 2020

That makes a lot of sense to me, yeah.

DiagramType:

[...]

  • is associated to a list of supported (output) image format

I hadn't even realized that mapping exists, when I put this together — I thought every output format was available for every input format. But from reading the Java service code I see that indeed, not every processor supports every output format. So, that adds an additional wrinkle, and makes the list of formats I added to the help not entirely correct.

If those mappings are available in the code, though, it could support something like this instead:

$ kroki-cli convert --help

-f, --format string     output format (default: infer from output file extension
                        otherwise svg); available formats depend on diagram type
--list-formats          list available output formats for the selected diagram type
                        (requires input filename or --type option)

$ kroki-cli convert -t plantuml -f list
Valid output formats for PlantUML: [base64 jpeg png svg txt utxt]
$ kroki-cli convert -t vega --list-formats
Valid output formats for Vega: [pdf png svg]

(And, of course, it could also validate the requested export format based on the input format.)


On the dictionaries...

For instance, ImageFormat.JPEG:

  • id: JPEG
  • names: ["jpg", "jpeg"]
  • fileExtensions: [".jpg", ".jpeg"]

Part of me thinks that there should be some concept of a "primary" vs. "alternate" or "alias" name(s) for things, to avoid redundancy. If you list the output formats supported, it doesn't make sense to display both jpg and jpeg — though, of course, both should be valid input.

And then file extensions, same thing? If you run kroki convert with -f jpeg or -f jpg, either:

  1. The file name corresponds to your -f argument (either something.jpeg or something.jpg). Which feels a bit odd, though I suppose it does offer the user maximum control/flexibility.
  2. The file is always named something.jpg (or always named something.jpeg) regardless. That seems more predictable, but then how does the code know which extension to prefer? "The first one on the list" feels a bit fragile.

For instance, DiagramType.GraphViz:

  • id: GraphViz
  • names: ["dot", "graphviz"]
  • fileExtensions: [".dot", ".gv", ".graphviz"]
  • supportedImageFormats: [ImageFormat.PNG, ImageFormat.SVG, ImageFormat.JPEG, ImageFormat.PDF]

It gets even trickier with diagram types, because looking at kroki/io/server/Server.java:

  1. Graphviz is a diagramService that's registered as the diagramHandler for type names "graphviz" and "dot", which are interchangeable.
  2. Vega is a diagramService that gets registered as two separate diagramHandler instances. One handles "vega", then a second, separate one with different options will handle "vegalite".
  3. Blockdiag is a diagramService that's registered as a single diagramHandler for all of the following names: "blockdiag", "seqdiag", "actdiag", "nwdiag", "packetdiag", "rackdiag", but it processes each one differently.

So what's the right line to draw?

  • I firmly believe the user shouldn't have to care about the diagramServices and whether the same class processes Vega and VegaLite. "vega" and "vegalite" should be separate diagramTypes.
  • But do we treat "graphviz" and "dot" as separate diagram types? Seems silly when you can easily do this:
    $ ./kroki convert -t graphviz hello.dot
    $ ./kroki convert -t dot hello.dot
    and you'll get the exact same hello.svg output regardless.
  • Blockdiag, OTOH, is completely different. All of its "names" are just aliases to the same diagramService instance, but it does matter which you use because it will process the input differently. Meaning:
    $ ./kroki convert -t rackdiag rack.diag
    $ ./kroki convert -t blockdiag rack.diag
    fail to generate the image {status: 500, body: Error 500: Error: Invalid specification
    

So it seems clear that all of Blockdiag's names have to be separate diagramTypes. Which I guess means Graphviz's should be as well. Just so it's not some weird outlier, it may be simpler if we treat "graphviz" and "dot" as two distinct types even though they're not.

@ggrossetie
Copy link
Member

Part of me thinks that there should be some concept of a "primary" vs. "alternate" or "alias" name(s) for things, to avoid redundancy. If you list the output formats supported, it doesn't make sense to display both jpg and jpeg — though, of course, both should be valid input.

Yes I think we should have a primary name "jpeg" and then alias/alternatives.

And then file extensions, same thing? If you run kroki convert with -f jpeg or -f jpg, either:
The file name corresponds to your -f argument (either something.jpeg or something.jpg). Which feels a bit odd, though I suppose it does offer the user maximum control/flexibility.
The file is always named something.jpg (or always named something.jpeg) regardless. That seems more predictable, but then how does the code know which extension to prefer? "The first one on the list" feels a bit fragile.

Regarding file extensions, we should let the users use any valid file extension associated with an ImageFormat.
For reference, the JPEG ImageFormat has actually 5 valid file extensions according to MDN: .jpg, .jpeg, .jfif, .pjpeg, .pjp.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types

In my opinion, 5 file extensions is excessive and I believe that .jfif, .pjpef and .pjp are very little used. So let's keep .jpeg and .jpg.

It gets even trickier with diagram types, because looking at kroki/io/server/Server.java:
Graphviz is a diagramService that's registered as the diagramHandler for type names "graphviz" and "dot", which are interchangeable.
Vega is a diagramService that gets registered as two separate diagramHandler instances. One handles "vega", then a second, separate one with different options will handle "vegalite".
Blockdiag is a diagramService that's registered as a single diagramHandler for all of the following names: "blockdiag", "seqdiag", "actdiag", "nwdiag", "packetdiag", "rackdiag", but it processes each one differently.
So what's the right line to draw?

vega and vegalite are distinct diagram libraries and the same is true for blockdiag, seqdiag, actdiag, nwdiag, packetdiag and rackdiag.
Conversely, graphviz and dot are alias. GraphViz uses a language called DOT and the dot binary can be used to convert GraphViz diagram. That's the reason why we allow to use both names.
https://graphviz.org/documentation/

I firmly believe the user shouldn't have to care about the diagramServices and whether the same class processes Vega and VegaLite. "vega" and "vegalite" should be separate diagramTypes.

Indeed 👍

But do we treat "graphviz" and "dot" as separate diagram types? Seems silly when you can easily do this:

They are not separated diagram types, just a diagram type with a primary name and an alternative name.

and you'll get the exact same hello.svg output regardless.

Yes that's intended.

So it seems clear that all of Blockdiag's names have to be separate diagramTypes. Which I guess means Graphviz's should be as well. Just so it's not some weird outlier, it may be simpler if we treat "graphviz" and "dot" as two distinct types even though they're not.

I'm not sure I understand why it's bothering you to have more than on name to reference a diagram type? I might be wrong but some users might be more familiar with "dot" than "graphviz" and vice-versa even if they are using the exact same syntax for their diagrams.

@ggrossetie
Copy link
Member

@ferdnyc If you want I can take a stab at it? I might have better suggestions/ideas if I dive into the code 😄

@ggrossetie
Copy link
Member

@ferdnyc I rebased on master and made a few changes, let me know what you think.

I'm quite happy with the result. It brings a few enhancements and removes some code duplication so I would like to move forward with this. In my opinion, we should address the other concerns in subsequent pull requests. We can create an issue to track progress.

cmd/convert.go Outdated Show resolved Hide resolved
@ggrossetie ggrossetie merged commit 5105cb7 into yuzutech:master Jan 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants