-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DH Code Review #1
base: empty
Are you sure you want to change the base?
Conversation
update run_pydistinto_beginners.py and output folder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- General remarks on
scripts/pipeline/calculate_simple.py
- The file is relatively long (>250 LOC)
- Most methods have many positional arguments (more than 4) which make it hard to read
- Pandas Dataframes could be used to reduce complexity
- Docstrings should be more concise
- Parameters should not be passed as concatinated strings
- There is some code duplication (welsh_t, LLR, chi_squared, ranksumtest)
- Use automated tools to improve your code https://github.com/maehr/pydistinto/pull/1/files#diff-9a65b3c578318adb9006f0a948af12ef7d261f8bf00dea5003adeac29f79a5b7
- Use a code formatter like black, yapf or autopep to get more consistency
README.md
is missing important pieces: (Also check https://www.makeareadme.com/)- Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
- Example usage and data: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems). Do the authors include example data to be used to test the software?
- Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
LICENSE.md
is missing.- Functionality is hard to judge because of unconventional API (i.e.
parameters.txt
) and missing documentation- Installation: Does installation proceed as outlined in the documentation?
- Functionality: Have the functional claims of the software been confirmed?
- Nice to Have things that are missing
- Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
- Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support?
@@ -0,0 +1,371 @@ | |||
# -*- coding: utf-8 -*- | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The filename does not indicate the functionality and there is very little information in the docstring. Checkout https://realpython.com/documenting-python-code/#package-and-module-docstrings
|
||
@author: KeliDu | ||
""" | ||
# ================================= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code is self-explanatory at this point. You can remove this comment.
|
||
@author: KeliDu | ||
""" | ||
# ================================= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code is self-explanatory at this point. You can remove this comment.
# ================================= | ||
|
||
import os | ||
import re |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove unused import
# ================================= | ||
|
||
import os | ||
import re |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove unused import
tf_frame1 = tf_frame.T.filter(regex=ids1, axis=1) | ||
tf_frame2 = tf_frame.T.filter(regex=ids2, axis=1) | ||
print("\nbinary1\n", binary1.head()) | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this comment for?
""" | ||
# Define logaddition and division-by-zero avoidance addition | ||
logaddition = logaddition+1 | ||
divaddition = 0.00000000001 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very unorthodox and not properly documented
""" | ||
# Define logaddition and division-by-zero avoidance addition | ||
logaddition = logaddition+1 | ||
divaddition = 0.00000000001 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very unorthodox and not properly documented
return devprops | ||
|
||
|
||
def calculate_scores(docprops1, docprops2, absolute1, absolute2, relfreqs1, relfreqs2, logaddition, segmentlength, idlists, tf_framefreqs1, tf_framefreqs2): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
many positional arguments
return devprops | ||
|
||
|
||
def calculate_scores(docprops1, docprops2, absolute1, absolute2, relfreqs1, relfreqs2, logaddition, segmentlength, idlists, tf_framefreqs1, tf_framefreqs2): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
many positional arguments
@@ -0,0 +1,371 @@ | |||
# -*- coding: utf-8 -*- | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The filename does not indicate the functionality and there is very little information in the docstring. Checkout https://realpython.com/documenting-python-code/#package-and-module-docstrings
@maehr this is great, thanks so much! What do you think the author should definitely address to get a code review completed badge? |
Most importantly documentation. But I recommend refactoring and simplifying of the code base to make it easier to understand (and maintain). |
The ticket for this code review is: DHCodeReview/DHCodeReview#3
This repository is ready for review.