Added partial update capability to sqlIngest #204
--Now able to specify page size, allowing larger pages if necessary
Bring dev up to date with upstream
--orm configuration files now in databaseOrm.py package
--updateDatabase method of DataHandler now allows partial updates of records in database
--runs roughly 90 seconds per 1000 records
--around 5000 new records can be expected per day
--fixed fetchSocrata method of DataHandler to correctly grab number of records specified in querySize parameter
Bringing up to date with main repo
--documented updateDatabase method
--added a couple syntax/linting fixes to sqlIngest file
Oh snap...does this solve the upsert issue?????????
Should do. It's slow, but it more or less works. Looks like SQLAlchemy doesn't have native upsert, so you have to jury-rig a solution. There might be a more efficient way, but we should be able to grab the latest couple thousand records daily with this.
Also oops on the linting. Looks like the linted file wasn't committed for some reason.
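For anyone following the upsert discussion, here is a minimal sketch of the "update if present, otherwise insert" pattern described above. It is not the code in this PR: it uses the standard-library sqlite3 module instead of SQLAlchemy so it runs on its own, and the table name `requests`, the columns `record_id`, `status`, `updated`, and the function `upsert_records` are all hypothetical.

```python
import sqlite3


def upsert_records(conn, records):
    """Manually upsert: try an UPDATE first, INSERT when no row matched.

    `records` is a list of dicts keyed by the (hypothetical) column names.
    Returns a tuple (rows_updated, rows_inserted).
    """
    updated = 0
    inserted = 0
    for rec in records:
        cur = conn.execute(
            "UPDATE requests SET status = :status, updated = :updated "
            "WHERE record_id = :record_id",
            rec,
        )
        if cur.rowcount == 0:
            # No existing row with this key, so this is a new record.
            conn.execute(
                "INSERT INTO requests (record_id, status, updated) "
                "VALUES (:record_id, :status, :updated)",
                rec,
            )
            inserted += 1
        else:
            updated += 1
    conn.commit()
    return updated, inserted


# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests ("
    "record_id TEXT PRIMARY KEY, status TEXT, updated TEXT)"
)

first_pull = [
    {"record_id": "a1", "status": "Open", "updated": "2020-01-01"},
    {"record_id": "b2", "status": "Open", "updated": "2020-01-01"},
]
second_pull = [
    {"record_id": "a1", "status": "Closed", "updated": "2020-01-02"},
    {"record_id": "c3", "status": "Open", "updated": "2020-01-02"},
]

first_counts = upsert_records(conn, first_pull)
second_counts = upsert_records(conn, second_pull)
```

Like the approach in this PR, the sketch does not compare old and new values: every record in the pull that already exists is rewritten, which is part of why a per-row approach is slow.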
--New functionality documented in /Documentation/ folder
--Now able to page an arbitrary number of the most recent records from a Socrata store and integrate them into the database
--Does not check whether records have changed; updates every record in the most recent pull
--Currently runs somewhat slowly (~45 seconds/1000 updates), but this may be a limitation of SQLAlchemy
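
The paging described above could look roughly like the sketch below. It only builds the query parameters for each page and makes no network calls. `$limit`, `$offset`, and `$order` are standard Socrata SODA query parameters; the function `page_params` and the ordering field `updateddate` are assumptions for illustration and are not taken from sqlIngest.

```python
def page_params(total_records, page_size, order_field="updateddate"):
    """Build one SODA parameter dict per page for the newest records.

    The last page is trimmed so the pages add up to exactly
    `total_records`, rather than rounding up to a full page.
    """
    if total_records < 0:
        raise ValueError("total_records must be non-negative")
    if page_size <= 0:
        raise ValueError("page_size must be positive")

    pages = []
    offset = 0
    while offset < total_records:
        limit = min(page_size, total_records - offset)
        pages.append(
            {
                # Newest first, so offset 0 starts at the most recent record.
                "$order": f"{order_field} DESC",
                "$limit": limit,
                "$offset": offset,
            }
        )
        offset += limit
    return pages


# Example: 2500 most recent records in pages of up to 1000.
pages = page_params(2500, 1000)
limits = [p["$limit"] for p in pages]
offsets = [p["$offset"] for p in pages]
```

Each dict would then be passed as the query parameters of one request, and the results fed to the update step page by page.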