Added partial update capability to sqlIngest #204

Merged
sellnat77 merged 9 commits into hackforla:dev on Jan 26, 2020

Conversation

ryanmswan
Contributor

--New functionality documented in the /Documentation/ folder
--Can now page through an arbitrary number of the most recent records from a Socrata store and integrate them into the database
--Does not check whether records have changed; updates every record in the most recent pull
--Currently runs somewhat slowly (~45 seconds per 1000 updates), but this may be a limitation of SQLAlchemy

--Page size can now be specified, allowing larger pages if necessary
Bring dev up to date with upstream
--ORM configuration now lives in the databaseOrm.py module
--updateDatabase method of DataHandler now allows partial updates of records in the database
  --runs at roughly 90 seconds per 1000 records
  --around 5000 new records can be expected per day
--Fixed fetchSocrata method of DataHandler so it fetches the number of records specified in the querySize parameter
Bringing up to date with main repo
--Documented updateDatabase method
--Added a couple of syntax/linting fixes to the sqlIngest file
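
The paging behaviour described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `page_params` is a hypothetical helper, and the only assumption about the data source is that it accepts the standard SoQL `$limit`, `$offset`, and `$order` query parameters. The ordering expression used in the example is likewise an assumption about how "most recent" would be defined for a given dataset.

```python
from typing import Dict, Iterator


def page_params(query_size: int, page_size: int, order: str) -> Iterator[Dict[str, object]]:
    """Yield SoQL query parameters covering `query_size` rows in pages.

    Hypothetical helper for illustration; not part of sqlIngest.
    """
    if query_size < 0 or page_size <= 0:
        raise ValueError("query_size must be >= 0 and page_size must be > 0")
    offset = 0
    while offset < query_size:
        # The final page is shortened so the total never exceeds query_size.
        limit = min(page_size, query_size - offset)
        yield {"$order": order, "$limit": limit, "$offset": offset}
        offset += limit


# Example: 2500 most recent rows in pages of up to 1000 (ordering is assumed).
pages = list(page_params(query_size=2500, page_size=1000, order=":updated_at DESC"))
for params in pages:
    print(params)
```

Capping the last page at the remaining row count is one way to make sure the number of rows fetched matches the requested query size exactly, whatever page size is chosen.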
Member

@sellnat77 left a comment

@sellnat77
Member

Oh snap...does this solve the upsert issue?????????

@ryanmswan
Contributor Author

> Oh snap...does this solve the upsert issue?????????

Should do. It's slow, but it more or less works. Looks like SQLAlchemy doesn't have a dialect-independent native upsert, so you have to jury-rig a solution. There might be a more efficient way, but we should be able to grab the latest couple thousand records daily with this.
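
For context, many databases support upsert directly in SQL, and SQLAlchemy exposes it through dialect-specific `insert()` constructs (for example `on_conflict_do_update` in the PostgreSQL dialect) rather than through one generic API. The sketch below shows the underlying SQL pattern with Python's built-in `sqlite3` module so that it is self-contained. It is not the approach taken in this PR: the `requests` table, its columns, and the `upsert_records` helper are hypothetical, and it assumes SQLite 3.24 or newer for `ON CONFLICT ... DO UPDATE`.

```python
import sqlite3


def upsert_records(conn, records):
    """Insert new rows and overwrite existing ones, keyed on record_id.

    Hypothetical helper. Like the update described in this PR, it does not
    check whether an existing row actually changed before overwriting it.
    """
    conn.executemany(
        """
        INSERT INTO requests (record_id, status)
        VALUES (:record_id, :status)
        ON CONFLICT(record_id) DO UPDATE SET status = excluded.status
        """,
        records,
    )
    conn.commit()


# In-memory database with a hypothetical two-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (record_id TEXT PRIMARY KEY, status TEXT)")

# First pull inserts A1; second pull updates A1 and inserts B2.
upsert_records(conn, [{"record_id": "A1", "status": "Open"}])
upsert_records(conn, [
    {"record_id": "A1", "status": "Closed"},
    {"record_id": "B2", "status": "Open"},
])

rows = conn.execute(
    "SELECT record_id, status FROM requests ORDER BY record_id"
).fetchall()
print(rows)
```

Sending a whole batch through one statement and committing once tends to be cheaper than issuing a select and an update per record, which may be part of why a row-by-row approach feels slow.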

@ryanmswan
Contributor Author

Also oops on the linting. Looks like the linted file wasn't committed for some reason.

@sellnat77 self-requested a review on January 26, 2020, 02:38
@sellnat77 merged commit 6bcda51 into hackforla:dev on Jan 26, 2020