added flag for clean columns for joining table dataset with source dataset #302

Merged

ywkim312
Member

@longshuicy @jonglee1 @navarroc There was already a method for joining the table dataset with the source dataset. I just added another variable to clean up the columns. If the new variable is true, it removes all columns from the source dataset except 'geometry' and 'guid', so the output GeoDataFrame only has the columns from the table dataset plus the geometry. The user can then easily convert this GeoDataFrame to a shapefile or GeoPackage using geopandas.
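For readers unfamiliar with the behavior described above, here is a minimal sketch of the idea using plain geopandas. It is not the pyincore implementation: the function name `join_table_with_source`, the parameter name `clean_columns`, and the sample columns (`year_built`, `damage_state`) are hypothetical and chosen only for illustration.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point


def join_table_with_source(source_gdf, table_df, clean_columns=False):
    """Join table rows onto source geometries by 'guid' (hypothetical helper)."""
    if clean_columns:
        # Keep only the join key and the geometry from the source dataset.
        source_gdf = source_gdf[["guid", "geometry"]]
    # Merging from the GeoDataFrame side keeps the result a GeoDataFrame.
    return source_gdf.merge(table_df, on="guid", how="left")


# Tiny made-up datasets standing in for a source dataset and a table dataset.
source = gpd.GeoDataFrame(
    {"guid": ["a", "b"], "year_built": [1990, 2005]},
    geometry=[Point(0, 0), Point(1, 1)],
    crs="EPSG:4326",
)
table = pd.DataFrame({"guid": ["a", "b"], "damage_state": ["DS_0", "DS_2"]})

joined = join_table_with_source(source, table, clean_columns=True)
print(list(joined.columns))
```

With the flag set, the source-only column `year_built` is dropped before the merge. The result could then be written out with something like `joined.to_file("joined.gpkg", driver="GPKG")`, where the file name is a placeholder.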

I am not sure yet how we will handle this join process; we probably need to discuss it more. However, having this PR will be helpful in some cases.

@ywkim312 ywkim312 self-assigned this Mar 23, 2023
@ywkim312 ywkim312 added this to the Sprint 8 milestone Mar 23, 2023
@longshuicy
Member

@ywkim312 Good to know that we already have a method for that. I'm most curious about how fast it is. Could you run a benchmark on the SLC data I shared?

@ywkim312
Member Author

> @ywkim312 Good to know that we already have a method for that. I'm most curious about how fast it is. Could you run a benchmark on the SLC data I shared?

Actually, it is very hard to make a fair comparison because this join happens on the local machine, whereas the service's join happens on the server side. However, using this code in pyincore is very meaningful because it will significantly reduce the server's workload and lets us avoid errors such as timeouts.
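As a rough illustration of how the local side alone could be timed, the sketch below measures a geopandas merge on synthetic data. It assumes made-up rows rather than the SLC data, says nothing about the server-side join, and the row count `n` is arbitrary.

```python
import time

import geopandas as gpd
import numpy as np
import pandas as pd
from shapely.geometry import Point

n = 10_000  # synthetic row count, not the SLC dataset
guids = [f"guid-{i}" for i in range(n)]

# Synthetic source dataset: one point geometry per guid.
source = gpd.GeoDataFrame(
    {"guid": guids},
    geometry=[Point(i, i) for i in range(n)],
    crs="EPSG:4326",
)
# Synthetic table dataset keyed by the same guids.
table = pd.DataFrame({"guid": guids, "value": np.arange(n)})

# Time only the in-memory join, excluding data construction.
start = time.perf_counter()
joined = source.merge(table, on="guid", how="left")
elapsed = time.perf_counter() - start

print(f"local join of {n} rows took {elapsed:.4f} s")
```

Any number this prints depends on the machine and excludes download time, so it is not comparable to a server-side measurement.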

pyincore/utils/datasetutil.py (outdated, resolved)
Contributor

@ylyangtw ylyangtw left a comment

Test ran fine. Changes look good. Approve~

@ywkim312 ywkim312 merged commit 6435f71 into develop May 14, 2024
7 checks passed
@ywkim312 ywkim312 deleted the 299-investigate-joincreating-shapefiles-using-pandas branch May 14, 2024 20:22
@ywkim312 ywkim312 restored the 299-investigate-joincreating-shapefiles-using-pandas branch May 14, 2024 20:22
@ywkim312 ywkim312 deleted the 299-investigate-joincreating-shapefiles-using-pandas branch May 14, 2024 20:22
Development

Successfully merging this pull request may close these issues.

Optimize joining process by exploring pyincore-side joining with geopandas
3 participants