Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: refactor CLI part of excel2xml (DEV-2190) #384

Merged
merged 5 commits into from May 25, 2023

Conversation

jnussbaum
Copy link
Collaborator

resolves DEV-2190

@jnussbaum jnussbaum self-assigned this May 25, 2023
@linear
Copy link

linear bot commented May 25, 2023

Copy link
Collaborator

@BalduinLandolt BalduinLandolt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

One small stylistic thin: I like to be able to "read code from top to bottom" meaning that the public method should be art the beginning, and the private methods after that, ideally in the order they are used in the public method. I definitely wouldn't make this a "hard rule", and there are many edge cases where this cannot be done reasonably anyways. But as a general rule of thumb, how do you feel about this?

Additionally, it came to my mind that the changes you introduced here, should have a big impact on unit testing: Before you could only do an "end to end" test of the huge function; now you have separate units that allow testing much more fine grained and systematically (including all happy and unhappy paths). I don't think this has to be in scope for the current task at hand, but it's definitely something worth considering.

True if everything went well, False otherwise
"""
# read and prepare the input file
success = True
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed? it's not actually modified

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's modified on (new) line 2244

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah... github stopped displaying me the file at line 2243 :)
In that case I would suggest moving it down there, or even change the logic to something like this (pseudo code):

if len(warnings) > 0:
    print("Finished with warnings")
    return False
print("finished...")
return True

or if you want to keep the flow you currently have

with warnings ... as w:
    write(file)
    success: bool = len(w) == 0

print(finished...)
return success

@jnussbaum
Copy link
Collaborator Author

I like to be able to "read code from top to bottom" meaning that the public method should be art the beginning, and the private methods after that

Interesting, I had been thinking about this point, too. But I had always thought that the ideal would be the other way round: Define called functions above their caller. This is already the case in many DSP-TOOLS modules, and perhaps that makes sense for Python, because in "bare" Python code outside a main() function, a function cannot be called if it is defined further down.
It's also interesting what Bing's AI chat has to contribute to this question:
image
I'm especially struck by the last sentence: "Ideally, your modules should be small/short enough such that the ordering of the functions doesn’t really matter." Aouch, here I have some homework to do...

@jnussbaum jnussbaum merged commit cd9cbb7 into main May 25, 2023
11 checks passed
@jnussbaum jnussbaum deleted the wip/dev-2190-refactor-excel2xml branch May 25, 2023 13:52
@daschbot daschbot mentioned this pull request May 25, 2023
@BalduinLandolt
Copy link
Collaborator

I like to be able to "read code from top to bottom" meaning that the public method should be art the beginning, and the private methods after that

Interesting, I had been thinking about this point, too. But I had always thought that the ideal would be the other way round: Define called functions above their caller. This is already the case in many DSP-TOOLS modules, and perhaps that makes sense for Python, because in "bare" Python code outside a main() function, a function cannot be called if it is defined further down. It's also interesting what Bing's AI chat has to contribute to this question: image I'm especially struck by the last sentence: "Ideally, your modules should be small/short enough such that the ordering of the functions doesn’t really matter." Aouch, here I have some homework to do...

Technically, it's not just outside a main() function, it has to be outside of any function at all. And this can never happen for DSP-TOOLS, I think.
But I agree with the AI that it's a matter of taste, so if you prefer it that way, sure.

Regarding the last point, I mostly agree with the AI: modules should definitely be very short! (to me, 200-300 lines max seems sane in many cases, but it's hard to put a number on it. - This, by the way, is also why I'm not a fan of over-documenting code... this is a "Ballance-Akt" to document enough but not too much, otherwise code gets harder to read again.)
But I'm also an advocate of many, short functions, which in turn makes ordering more important. So even for 200 lines of code, if you have short functions, it's very helpful to have a understandable ordering.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants