Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add impl #9

Draft
wants to merge 39 commits into
base: main
Choose a base branch
from
Draft

add impl #9

wants to merge 39 commits into from

Conversation

zargot
Copy link
Collaborator

@zargot zargot commented May 7, 2024

No description provided.

@zargot zargot requested a review from yulric May 7, 2024 17:53
Copy link
Contributor

@yulric yulric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only took a look at the rules.py file for now since that's the one which has the only completed function (?). Only one comment. I also added docstrings to everything.



class SchemaCtx:
filename: str
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be filepath? It looks you're setting it to the schema_path argument.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, fixed by using the actual filename

@zargot
Copy link
Collaborator Author

zargot commented May 10, 2024

Only took a look at the rules.py file for now since that's the one which has the only completed function (?). Only one comment. I also added docstrings to everything.

Thanks for the review.

I removed most of the docstrings you added, since they mostly seemed to be AI-generated boilerplate with a lot of confusing info, which also made it much harder to read the actual code. It's also in a private module, which we won't be generating documentation for.

I've retained a more complete docstring on the load function since it's the main function of the module.

When we're adding docstrings to the public modules I'd prefer to make them small and concise, and to use the Sphinx RST format. Sphinx should also be able to infer the types from our type hints.

@yulric
Copy link
Contributor

yulric commented May 10, 2024

Only took a look at the rules.py file for now since that's the one which has the only completed function (?). Only one comment. I also added docstrings to everything.

Thanks for the review.

I removed most of the docstrings you added, since they mostly seemed to be AI-generated boilerplate with a lot of confusing info, which also made it much harder to read the actual code. It's also in a private module, which we won't be generating documentation for.

I disagree re private functions should not have docstrings. They may not matter to a user of the library but as a developer of the library I find them very helpful. Otherwise I'm not sure what the function is meant to be doing.

I've retained a more complete docstring on the load function since it's the main function of the module.

When we're adding docstrings to the public modules I'd prefer to make them small and concise, and to use the Sphinx RST format. Sphinx should also be able to infer the types from our type hints.

Small and concise does not read well IMO. Documentation should be written in human-readable language so that its easier to understand, I find that short/terse sentences puts burden on the reader.

The Sphinx text makes sense.

@zargot
Copy link
Collaborator Author

zargot commented May 10, 2024

I disagree re private functions should not have docstrings. They may not matter to a user of the library but as a developer of the library I find them very helpful. Otherwise I'm not sure what the function is meant to be doing.

You're right that docstrings can be good if they add necessary information, however, the function/parameter names and type hints should already give most of that information IMO, and it becomes very difficult to see the context of the code when every private function starts with 10-20 lines of documentation. Documentation in a public API file, on the other hand, doesn't interfere with the code in the same way.

Small and concise does not read well IMO. Documentation should be written in human-readable language so that its easier to understand, I find that short/terse sentences puts burden on the reader.

I think it's important to keep everything DRY and information dense. The shortest sentence to describe something, while including all the necessary info, would be the best sentence, IMO. In the case of a public API it would be important to explain everything the user needs to know, and more text can be used, since it's meant as more of a specification, but in our private module code I think it's extra important to not repeat ourselves and be concise.

@zargot
Copy link
Collaborator Author

zargot commented May 10, 2024

I'm making sure to add a short docstring to private functions (that need it) from now on, as I can see the value it provides when doing a cold reading, but I'd like to avoid it taking up more than one line/sentence per function.

Copy link
Contributor

@yulric yulric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. We should document the type of errors that a function can raise since that's not available in the function signature.


def coerce_value( # type: ignore
ctx: SchemaCtx,
type_class,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should document the type of type_class since it does not have a type.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@zargot
Copy link
Collaborator Author

zargot commented May 15, 2024

  1. We should document the type of errors that a function can raise since that's not available in the function signature.

Done. I was only raising a single error type, and it always happened in fail/gen_error, which is why I just wrote "or fails", since failure always meant to raise that error. It also didn't depend on the name of the error, so it could make it easier to maintain. However, I agree that it can be good to see the error type names in the docstrings, and I also forgot to include one other error type. It should be good now.

@zargot zargot marked this pull request as draft May 15, 2024 18:40
@zargot
Copy link
Collaborator Author

zargot commented May 15, 2024

waiting for the rest of the implementation before merge.

@zargot zargot changed the title add cli, api, cons, rules add impl May 15, 2024
having the option to use 'all' for tables would complicate the
implementation since schema parsing becomes dependent on the tables that
exist in the input-data
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants