@jlchang
Contributor

@jlchang jlchang commented Nov 7, 2019

This PR writes metadata validation errors to a local file, which is then delocalized (copied) to the Google bucket that holds the corresponding cell_metadata file.

Implementation notes:
• The validation error output file (local or in the bucket) is overwritten on each new validation run.
• Added self.local_file_path (ingest_files.py, resolve_path) to fix an issue with opening a metadata file located in a Google bucket when open_as is "dataframe".
• delocalize_error_file and the conditions for invoking it in ingest_pipeline.py are currently hard-coded for the metadata ingest case. Refactoring will be needed if we capture ingest errors for other file types.
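
For readers unfamiliar with the flow, here is a minimal sketch of the overwrite-then-delocalize behavior described in the notes above. It is not the code in this PR: the function signatures, the `upload` parameter, the error file name, and the choice to place the file at the top level of the bucket are all assumptions made for illustration. Only the `google-cloud-storage` calls in `gcs_upload` are real library calls, and they are imported lazily so the rest can run without credentials.

```python
import os
import tempfile
from urllib.parse import urlparse


def write_error_file(errors, local_path):
    """Write one error per line, replacing any previous contents."""
    # Mode "w" truncates the file, so each validation run overwrites the last.
    with open(local_path, "w") as handle:
        for error in errors:
            handle.write(f"{error}\n")
    return local_path


def bucket_name_from_gs_path(gs_path):
    """Return the bucket name from a gs://bucket/object path."""
    parsed = urlparse(gs_path)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// path: {gs_path}")
    return parsed.netloc


def gcs_upload(bucket_name, blob_name, local_path):
    """Upload with google-cloud-storage (needs the package and credentials)."""
    from google.cloud import storage  # lazy import, only needed for real uploads

    client = storage.Client()
    client.bucket(bucket_name).blob(blob_name).upload_from_filename(local_path)


def delocalize_error_file(local_path, metadata_gs_path, upload=gcs_upload):
    """Hypothetical signature: copy the error file into the metadata file's bucket."""
    bucket_name = bucket_name_from_gs_path(metadata_gs_path)
    blob_name = os.path.basename(local_path)  # assumed destination name
    upload(bucket_name, blob_name, local_path)
    return f"gs://{bucket_name}/{blob_name}"
```

Passing the uploader in as a parameter keeps the bucket logic testable without network access, and would be one way to generalize beyond the metadata-only case mentioned in the last note.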

This PR also includes:
• Updated metadata validation tests to conform to the latest metadata convention (AMC_v1.1.3)

This fulfills SCP-1969

Member

@eweitz eweitz left a comment

Looks good! Thanks for the walk-through.

@codecov

codecov bot commented Nov 14, 2019

Codecov Report

Merging #47 into master will increase coverage by 0.06%.
The diff coverage is 65%.

@@            Coverage Diff            @@
##           master     #47      +/-   ##
=========================================
+ Coverage   63.74%   63.8%   +0.06%     
=========================================
  Files          11      11              
  Lines        1208    1235      +27     
=========================================
+ Hits          770     788      +18     
- Misses        438     447       +9

Impacted Files                            Coverage Δ
ingest/validation/validate_metadata.py    83.14% <100%> (+0.7%) ⬆️
ingest/ingest_pipeline.py                 39.83% <13.33%> (-0.88%) ⬇️
ingest/ingest_files.py                    86.77% <75%> (-0.52%) ⬇️
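
As a sanity check on the project-level numbers in the report, the sketch below recomputes them from the Hits and Lines counts, assuming coverage is simply hits divided by lines. The function name is hypothetical. The results match the reported percentages to within display rounding, and the change comes out at the reported +0.06%.

```python
def coverage_percent(hits, lines):
    """Coverage as a percentage, assuming coverage = hits / lines."""
    return 100.0 * hits / lines


# Hits and Lines taken from the coverage diff in the report.
before = coverage_percent(770, 1208)  # master
after = coverage_percent(788, 1235)   # this PR
delta = after - before

print(f"{before:.2f}% -> {after:.2f}% ({delta:+.2f}%)")
```

The 65% diff coverage is a separate figure and cannot be derived from these totals.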

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4e5e55d...1688688.

@jlchang jlchang merged commit 4e9e9ca into master Nov 14, 2019
@jlchang jlchang deleted the jlc_error2bucket branch November 14, 2019 20:23
5 participants