SPIKE - refactoring analysis of initial solution #4
Comments
I reviewed https://github.com/absa-group/living-doc-website-example/tree/main/scripts and found some existing ideas in https://github.com/absa-group/living-doc-website-example/issues/27. I like those proposals. Apart from those, here are some of mine:
-- Consider removing unnecessary code comments. Things like this do not bring any new information:
# Process issues
issue_list = process_issues(issues, org_name, repo_name)
-- Feel free to experiment with OOP. The current code is almost script-like, very procedural, but it's okay for now.
-- Be careful about variable scopes. For example, in
-- Be careful about things like these:
It makes your code less portable: this can't run on Windows because of the different path delimiters. Better is to use
-- consider using
-- consider extracting / changing some data structures into specific custom data classes, for example this one:
from dataclasses import dataclass
from typing import List

@dataclass
class Repository:
    orgName: str
    repoName: str
    queryLabels: List[str]
also this might be worth considering:
Also, this can be improved using data classes, because you have a well-defined structure and you would keep it separate from the processing:
unique_projects[project_id] = {
    "ID": project_id,
    "Number": project_number,
    "Title": project_title,
    "Owner": org_name,
    "RepositoriesFromConfig": [repo_name],
    "ProjectRepositories": [],
    "Issues": [],
    "FieldOptions": sanitized_field_options_dict
}
-- this can be done better:
def send_graphql_query(...):
    ...
    return {}

if len(response) == 0:
    return []
like:
def send_graphql_query(...):
    ...
    return None  # or just `return`, because None is the default

if not response:
    return
and then, if you are working with None, you would need to add one more check for things like this:
if not projects:  # this one would need to be added
    return
for project in projects:
-- perhaps some constants can be extracted into |
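A minimal sketch of the data-class suggestion applied to the `unique_projects` entry above. The class name `Project`, the snake_case field names, the field types, and the sample values are assumptions made for illustration; only the set of keys comes from the dictionary in the comment.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Project:
    # Fields mirror the keys of the original dictionary
    # (hypothetical snake_case spelling, assumed types).
    id: str
    number: int
    title: str
    owner: str
    # default_factory gives every instance its own list/dict
    # instead of one shared mutable default.
    repositories_from_config: List[str] = field(default_factory=list)
    project_repositories: List[str] = field(default_factory=list)
    issues: List[Dict[str, Any]] = field(default_factory=list)
    field_options: Dict[str, Any] = field(default_factory=dict)


# Still keyed by project ID, as in the original `unique_projects` dictionary.
unique_projects: Dict[str, Project] = {}

# Sample values are made up for the example.
project = Project(
    id="PVT_1",
    number=1,
    title="Demo board",
    owner="my-org",
    repositories_from_config=["my-repo"],
)
unique_projects[project.id] = project
```

With this shape, a typo in a field name fails at attribute access or under a type checker such as mypy, whereas a misspelled dictionary key can pass unnoticed until the data is read.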
Also, more about functions vs classes: as I said, your code has pretty much no OOP. Perhaps that's fine for now, and it's not something that has to be done no matter what. This is a pretty cool video: https://www.youtube.com/watch?v=o9pEzgHorH0, maybe you will like it. Some people also make quite good points here: https://stackoverflow.com/questions/33072570/when-should-i-be-using-classes-in-python?rq=1 Generally speaking, if you have some processing/functionality that is logically close together and operates on some data, then encapsulating it in classes can lead to much better readability and maintainability of your code. But for now, I would be okay with the current approach (since it's a short collection of scripts, each about 200-300 lines of code anyway). Perhaps just consider wrapping some of the more complex data types in data classes, and once this grows, it might make sense to revisit this OOP idea :) I can help |
Btw, in terms of additional tooling for checking your code and/or managing the environment etc., I recommend using: black, flake8, pylint, pytest, mypy, and potentially poetry. This project has some of these things already: https://github.com/AbsaOSS/spline-python-agent/blob/master/pyproject.toml Disclaimer: I've never used Poetry, because I haven't been using Python for around 2 years now. I used to like the Pipenv project, but Pipenv seemed to be a bit dead for a while, and I've heard really cool things about Poetry, which seems to be quite popular these days. |
@lsulak How about using typing.Optional for a return value that can be None? |
Refactoring known objects into classes (now defined list) is a handy tip for this project. I would like to see it as one of the main goals of the following refactoring. |
Yes, in cases where None can be returned along with some non-None value, like a list or so: https://stackoverflow.com/questions/39429526/how-to-specify-nullable-return-type-with-type-hints
from datetime import datetime
from typing import Optional

def get_some_date(some_argument: int=None) -> Optional[datetime]:
    if some_argument is not None and some_argument == 1:
        return datetime.utcnow()
    else:
        return None
or, instead of |
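A minimal sketch combining the two points discussed above: an `Optional` return annotation and the extra check the caller then needs. The function names `fetch_projects` and `count_issues` and the shape of the response dictionary are hypothetical and are not taken from the reviewed scripts.

```python
from typing import List, Optional


def fetch_projects(response: dict) -> Optional[List[dict]]:
    """Hypothetical helper: return the project list, or None for an empty response."""
    if not response:
        return None
    # The "projects" key is an assumed response shape.
    return response.get("projects", [])


def count_issues(response: dict) -> int:
    """Hypothetical caller showing the extra check needed when None is possible."""
    projects = fetch_projects(response)
    if not projects:  # covers both None and an empty list
        return 0
    return sum(len(project.get("issues", [])) for project in projects)
```

The annotation makes the None case visible to readers and to a type checker such as mypy, which can then flag callers that iterate over the result without checking it first.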
This comment summarizes all SPIKE points for the POC refactoring. I suggest that each point becomes a separate refactoring task. This is, in my opinion, the best order in which to refactor the POC:
from dataclasses import dataclass
from typing import List

@dataclass
class Repository:
    orgName: str
    repoName: str
    queryLabels: List[str]

""" it makes your code less multiplatform - this can't run on Windows because of different path delimiters. Better is to use os.path.join function """

def main():
    data = "My data read from the Web"
    print(data)

if __name__ == "__main__":
    main()
|
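A short sketch of the `os.path.join` point from the summary above. The directory and file names are hypothetical. `pathlib` is not mentioned in the thread and is shown only as a possible alternative from the standard library.

```python
import os
from pathlib import Path

# Hypothetical directory and file names, used only for illustration.
output_dir = "output"
file_name = "issues.json"

# Hard-coded delimiter: this is the kind of construct the review warns about.
hardcoded = output_dir + "/" + file_name

# os.path.join inserts the separator of the platform the script runs on.
joined = os.path.join(output_dir, file_name)

# pathlib builds the same path through the / operator on Path objects.
as_path = Path(output_dir) / file_name

print(joined)
print(as_path)
```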
I have addressed the mentioned refactoring topic in epic #3. |
Regarding 3, logging: it depends on how exactly this will be implemented and deployed, and on what the target log storage is, if any. Watchtower might be helpful (https://pypi.org/project/watchtower/). It sends logs directly to AWS CloudWatch, and it's very easy to plug into the standard Python logger :) |
As we are a simple GH Action, the logging is about:
|
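A minimal sketch of a standard Python logger setup, assuming the scripts run as a GitHub Action step and that writing log lines to stdout is enough because the job log shows the step's output. The helper name `setup_logging`, the logger name, and the message format are assumptions for illustration.

```python
import logging
import sys


def setup_logging(level: int = logging.INFO) -> logging.Logger:
    """Hypothetical helper: configure a logger that writes to stdout."""
    logger = logging.getLogger("living_doc")  # logger name is an assumption
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


logger = setup_logging()
logger.info("Processing started")
```

Because the setup goes through the standard `logging` module, a different handler (for example one from a library such as Watchtower) could be attached later with `addHandler` without touching the call sites.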
I believe we have reached here the point to close this Issue as Solved. |
I am of the same opinion that we reached our goal for this SPIKE. I am closing this Issue as Solved. |