Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Utilities] Proposal, time.py, format.py #44

Merged
merged 7 commits into from
Jul 14, 2021
Merged

Conversation

Rubix982
Copy link
Contributor

@Rubix982 Rubix982 commented Jul 9, 2021

[Utilities] Proposal, time.py, format.py

Pull Request Type

  • 🏆 Enhancements - adds 5 functions as utils

Purpose

Implementing some utilities

Why?

time.py

Time utility to convert date format into a UNIX timestamp.

The code to me felt redundant and the quality can be approved, but it works as expected for now, and can manage 6 + 1 formats of the date, making it flexible, starting from Year to Second (not ms).

format.py

Functionality for converting some data types into mainly GeoJSON, for merging GeoJSON

This component helps to convert data structure formats. I also added functionality for merging GeoJSONS when building req#2 thinking it might need multiple endpoints - which it didn't in the end, but it is helpful if want to 'join' GeoJSON based on some key in the features.

Feedback required over

  • A quick pair of 👀 on the code
  • Discussion on the technical approach

Feedback required when

Whenever feasible

Mentions

The code to me feels redundant and the quality can be approved, but it works as expected for now, and can manage a 6 + 1 formats of the date, making it flexible, starting from Year to Second (not ms)
…N, for merging GeoJSON

This component helps to convert data structure formats. I also added a functionality for mergin GeoJSONS when building req#2 thinking it might need multiple end points - which it didn't in the end, but nevertheless, it is helpful if want to 'join' GeoJSON based on some key in the features
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 9, 2021
Copy link
Contributor

@gmelodie gmelodie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code seems to rewrite a lot of features that are already present in Python itself. Always look for native Python solutions for your problems before implementing your own. Also, if you think a function needs refactoring, do that yourself before submitting the PR. Prioritize quality code over volume

# YYYY-MM-DDTHH:MM:SS
match = re.compile(r"[0-9]+-[0-9]+-[0-9]+T[0-9]+:[0-9]+:[0-9]+").match(date)

if match is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if match is not None:
if not match:
return

Comment on lines 57 to 65
if (
int(year_month_day[0]) > datetime.datetime.now().year
or int(year_month_day[1]) > 12
or int(year_month_day[2]) > 31
or int(hour_minute_second[0]) > 23
or int(hour_minute_second[1]) > 59
or int(hour_minute_second[2]) > 59
):
raise InvalidDateError(date)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create a function and call it like is_invalid(year_month_day, hour_minute_second)

Comment on lines 91 to 98
if (
int(year_month_day[0]) > datetime.datetime.now().year
or int(year_month_day[1]) > 12
or int(year_month_day[2]) > 31
or int(hour_minute_second[0]) > 23
or int(hour_minute_second[1]) > 59
):
raise InvalidDateError(date)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function is_invalid

Comment on lines 152 to 157
if (
int(year_month_day[0]) > datetime.datetime.now().year
or int(year_month_day[1]) > 12
or int(year_month_day[2]) > 31
):
raise InvalidDateError(date)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function is_invalid (you'll need to adjust the function to work with only one of the two types)

# YYYY-MM-DD
match = re.compile(r"[0-9]+-[0-9]+-[0-9]+").match(date)

if match is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if match is not None:
if not match:
return

Comment on lines 70 to 71
geojson_from = geojson_one.copy()
geojson_into = geojson_two.copy()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe geojson_src and geojson_dest instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed. Thank you.

Comment on lines 62 to 68
def merge_geojson(
geojson_one: dict,
geojson_one_key: str,
geojson_two: dict,
geojson_two_key: str,
debug=True,
) -> dict:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a docstring, it's not evident what it does just by looking at the name (maybe consider changing the name to mutually_update_gojsons or something)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in the latest commit.

geojson_one_key: str,
geojson_two: dict,
geojson_two_key: str,
debug=True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default value of Debug should not be True

Suggested change
debug=True,
debug=False,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed debug all together. For testing purposes, I forgot to remove it.

Comment on lines 62 to 122
def merge_geojson(
geojson_one: dict,
geojson_one_key: str,
geojson_two: dict,
geojson_two_key: str,
debug=True,
) -> dict:

geojson_from = geojson_one.copy()
geojson_into = geojson_two.copy()

list_one = []
list_two = []

for from_features in geojson_from["features"]:

for into_features in geojson_into["features"]:

if (
geojson_two_key not in into_features["properties"]
or geojson_one_key not in from_features["properties"]
):
# * REFACTOR: This is better if it can be logged
# * in the future
continue

list_one.append(int(from_features["properties"][geojson_one_key]))
list_two.append(int(into_features["properties"][geojson_two_key]))

if debug:
print(
int(from_features["properties"][geojson_one_key])
== int(into_features["properties"][geojson_two_key]),
int(into_features["properties"][geojson_two_key]),
int(from_features["properties"][geojson_one_key]),
)

if int(from_features["properties"][geojson_one_key]) == int(
into_features["properties"][geojson_two_key]
):

old_properties = [key for key in from_features["properties"].keys()]

new_properties = [key for key in into_features["properties"].keys()]

if debug:
print(old_properties)
print(new_properties)

for feature in new_properties:

if feature not in old_properties:

from_features["properties"][feature] = old_properties[
"properties"
][feature]

if debug:
print(list(set(list_one) & set(list_two)))

return geojson_from
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this entire function could be done using a feature native of Python:

return {**geojson_one, **geojson_two}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly. I wish to join based on exact ID key matches, similar to an SQL join. Performing return {**geojson_one, **geojson_two} would join all of the keys together and not care for the properties aspect of the GeoJSON

Comment on lines 103 to 117
old_properties = [key for key in from_features["properties"].keys()]

new_properties = [key for key in into_features["properties"].keys()]

if debug:
print(old_properties)
print(new_properties)

for feature in new_properties:

if feature not in old_properties:

from_features["properties"][feature] = old_properties[
"properties"
][feature]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
old_properties = [key for key in from_features["properties"].keys()]
new_properties = [key for key in into_features["properties"].keys()]
if debug:
print(old_properties)
print(new_properties)
for feature in new_properties:
if feature not in old_properties:
from_features["properties"][feature] = old_properties[
"properties"
][feature]
for feature in into_features["properties"].keys():
if feature not in from_features["properties"].keys():
from_features["properties"][feature] = old_properties["properties"][feature]

@Rubix982
Copy link
Contributor Author

Rubix982 commented Jul 9, 2021

Ayeee, @gmelodie ! Thank you. :D

Copy link
Member

@cbeddow cbeddow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No technical comments from me, looks like several from @gmelodie for addressing. The format.py I think offers good potential for the future in case other formats are added. May consider leaving room for the future if a user wants to export to a local file in .csv or .shp format, they could import the needed drivers (like via fiona if it were to be brought back) here and then always convert from geojson into the final export format. This is different from the JSON transformations going on here now, of course, but related.

@Rubix982
Copy link
Contributor Author

@gmelodie your review, please

@OmarMuhammedAli Can I improve this bit of logic even further?

@Rubix982 Rubix982 added this to Pull Requests - In Progress in MLH Fellowship Jul 11, 2021
@Rubix982
Copy link
Contributor Author

@gmelodie review plisssssssssss

refactored to list comprehension for better performance
Copy link
Contributor

@OmarMuhammedAli OmarMuhammedAli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some logical errors on format.py that I fixed to merge the PR. Great job!

@OmarMuhammedAli OmarMuhammedAli merged commit 04ec953 into main Jul 14, 2021
@OmarMuhammedAli OmarMuhammedAli deleted the Utilities-Proposal-2 branch July 14, 2021 23:33
@OmarMuhammedAli OmarMuhammedAli moved this from Pull Requests - In Progress to Done in MLH Fellowship Jul 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. utility
Projects
Development

Successfully merging this pull request may close these issues.

None yet

5 participants