Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support validating JSON piped to stdin #251

Closed
alfreb opened this issue Apr 6, 2023 · 6 comments
Closed

Support validating JSON piped to stdin #251

alfreb opened this issue Apr 6, 2023 · 6 comments
Labels
enhancement New feature or request

Comments

@alfreb
Copy link

alfreb commented Apr 6, 2023

I want to do something like:

$ some_program  | check-jsonschema --schemafile my.schema.json --stdin

To be able to use check-jsonschema to validate JSON output from programs. Often one can add schema validation to the program itself, but in many cases, this is

  • impractical, e.g., we don't have a good schema validation library for the implementation language.
  • impossible, e.g., we don't have the program source code, but just a third-party binary
  • non-performant, e.g., we want to write the source code to implicitly output correct JSON without having to post-process, but we want to validate as a part of our test pipeline.

Are you open to a pull request with this feature?

@sirosen
Copy link
Member

sirosen commented Apr 11, 2023

I started to take a look at how hard this would be last week and didn't get to writing a reply; sorry about that.
I would definitely be open to a PR for this, but I think that the right interface would be to allow - to refer to stdin, not a separate flag.

That would make for usage as follows:

some_prog | check-jsonschema --schemafile foo.json -

I think it's a bit messier than you might expect to get this into the codebase. Ideally, from a maintenance standpoint, the instance files would be parsed using click.File("r"), which would handle taking filenames and - automatically. But then the filenames are expected to be paths in several locations, which needs to be adjusted as an expectation.

I may also start working on this myself at some point. I think it would be a good feature.

@sirosen sirosen added the enhancement New feature or request label Apr 11, 2023
@sloanlance
Copy link

sloanlance commented Sep 23, 2023

Ideally, from a maintenance standpoint, the instance files would be parsed using click.File("r"), which would handle taking filenames and - automatically. But then the filenames are expected to be paths in several locations, which needs to be adjusted as an expectation.

I may also start working on this myself at some point. I think it would be a good feature.

I've started working on this in a fork. (In fact, almost done!) I didn't read your comment before I began working, but coincidentally, I happened to do it as you suggested, using click.File("r").

However, a simpler implementation would be to handle the list of file names, adding /dev/stdin if the list is empty or replacing occurrences of - with that. Would you be interested in that approach instead?

@sirosen
Copy link
Member

sirosen commented Sep 23, 2023

That's great! Many, many thanks for looking into this! I was doing some tinkering on this front as well, after the other ticket, so I just committed what I had to be able to put it in a branch to share -- in case it's useful: https://github.com/python-jsonschema/check-jsonschema/compare/support-stdin

However, a simpler implementation would be to handle the list of file names, adding /dev/stdin if the list is empty or replacing occurrences of - with that. Would you be interested in that approach instead?

I would prefer to use click.File, but I could be convinced otherwise.
It nicely encapsulates this logic for us, making non-stdin and stdin cases look pretty similar (we get back an open stream rather than a string).

So even though it's more work to accommodate it today, I think it will be more maintainable in the long term.

If the behavior is enough trouble that it's the difference between you being able to submit a PR or not though, I'd rather have the PR than click.File! 😂

sloanlance added a commit to sloanlance/check-jsonschema that referenced this issue Sep 25, 2023
Works through steps of argument parsing and accessing open files.  TODO:  Update schema parser arguments.
@sirosen
Copy link
Member

sirosen commented Sep 28, 2023

I decided to knock this out because I thought the branch I had was most of the way there other than having to pull in a bunch of test fixes. I considered holding off a bit longer -- I am glad and thankful for the interest in contributing a change, and I hope this experience isn't discouraging! -- but concluded that I have the time and attention for this now and I shouldn't waste the opportunity.

This was handled in #332 . I'll pull together a final changelog and then expect this in a release!

@sirosen sirosen closed this as completed Sep 28, 2023
sloanlance added a commit to sloanlance/check-jsonschema that referenced this issue Sep 30, 2023
@sloanlance
Copy link

Oh, you went ahead and completed it? I didn't check the issue here before I continued working on it. 😆

Well, I'll see what you ended up with. I compared my branch to yours. They were very similar, but I had an extra feature that I think would be very helpful. I also had some questions about why some path function arguments were declared as path objects or string in a number of places when the path values seemed to always be string.

@sirosen
Copy link
Member

sirosen commented Oct 1, 2023

Sorry about that! I had the work mostly done except for some test fixes, and then I put it on ice for a few days to give time and space for a contribution. But it was an awkward situation (one branch from the maintainer, one from a new contributor, both on the same feature) and I made the call that spending another hour and releasing it was best. Maybe a mistake, but these things are hard to judge.

I had an extra feature that I think would be very helpful.

I'd be more than happy to take a separate feature request or PR!

I also had some questions about why some path function arguments were declared as path objects or string in a number of places when the path values seemed to always be string.

The simplest answer is that there were some test cases (not sure if any still exist) which pass paths built from pytest fixtures.

But also I happen to think that Path | str is a good interface for things which handle paths. If you write your infrastructure of helpers that way, they'll all be mutually compatible and your main application logic can choose whichever representation is appropriate.

So in inclined to keep things this way, although I am always open to having my mind changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants