Currently, if the same data file is ingested a second time, there are situations where the 'exists' check fails because the new file is compared only against the most recent version (df_functions.py, def:updatedatafile, lines 81-89 of v0.2.1).
The code therefore needs to be updated to check the new data file against all previously ingested versions. This should be done using the new 'jhash' field already added to the 'json_files' table, which stores an md5 hash of the 'file' field. Although 'jhash' may well be unique across the table, searching on both 'file_lookup_id' and 'jhash' would verify whether the file had already been uploaded.
Note: in code, the 'generatedAt' field must be emptied (set to '') before the md5 hash is generated.
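A minimal sketch of the proposed check, assuming a SQLite backend and that 'json_files' exposes 'file_lookup_id' and 'jhash' columns as described above; the function and variable names here are hypothetical, and the JSON canonicalization (sorted keys) is an assumption about how the stored 'file' field is serialized before hashing:

```python
import hashlib
import json
import sqlite3

def compute_jhash(file_text: str) -> str:
    """Hash the data file with the volatile 'generatedAt' field emptied."""
    data = json.loads(file_text)
    data["generatedAt"] = ""  # must be set to '' before hashing, per the issue
    # sort_keys keeps the serialization stable across ingests (an assumption
    # about how the 'file' field is canonicalized before the md5 is taken)
    canonical = json.dumps(data, sort_keys=True)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()

def already_ingested(conn: sqlite3.Connection,
                     file_lookup_id: int, jhash: str) -> bool:
    """Check the new file against every ingested version, not just the
    most recent one, by searching on both file_lookup_id and jhash."""
    row = conn.execute(
        "SELECT 1 FROM json_files"
        " WHERE file_lookup_id = ? AND jhash = ? LIMIT 1",
        (file_lookup_id, jhash),
    ).fetchone()
    return row is not None
```

Because 'generatedAt' is blanked before hashing, two ingests of the same data that differ only in their generation timestamp produce the same jhash and the duplicate is detected.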