[WIP] Extract missing values (to metadata) #125

Closed
wants to merge 6 commits into from

roll (Contributor) commented Feb 27, 2020

coveralls commented Feb 27, 2020

Pull Request Test Coverage Report for Build 441

  • 35 of 37 (94.59%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 85.203%

Changes Missing Coverage        Covered Lines   Changed/Added Lines   %
dataflows/processors/load.py    35              37                    94.59%

Totals
Change from base Build 439: 0.2%
Covered Lines: 1739
Relevant Lines: 2041

💛 - Coveralls

missing_values = schema.get('missingValues', [])
if not self.extract_missing_values['values']:
    self.extract_missing_values['values'] = missing_values
for field in schema.get('fields', []):

Member commented:

Why are you changing the schema?
This isn't explained in the README.

target = self.extract_missing_values['target']
values = self.extract_missing_values['values']
fieldMap = {}
for resource in self.dp.descriptor['resources']:

Member commented:

Why this loop?
Can't you change the descriptor directly?

values = self.extract_missing_values['values']
fieldMap = {}
for resource in self.dp.descriptor['resources']:
    if resource.get('name', 'res') == descriptor.get('name', 'des'):

Member commented:

a bit hacky...
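
A minimal sketch of why the comparison above reads as hacky (the helper name `names_match` is hypothetical, not part of the PR): because the two fallback names differ, a resource and a descriptor that both lack a `name` never match, while a resource literally named `des` matches any unnamed descriptor.

```python
def names_match(resource, descriptor):
    """Hypothetical helper reproducing the comparison from the quoted line."""
    # The fallbacks 'res' and 'des' differ on purpose, so two unnamed
    # objects are never treated as the same resource.
    return resource.get('name', 'res') == descriptor.get('name', 'des')


print(names_match({'name': 'data'}, {'name': 'data'}))  # True
print(names_match({}, {}))                              # False
print(names_match({'name': 'des'}, {}))                 # True (accidental match)
```

Comparing explicitly, for example skipping resources that have no `name` at all, would avoid relying on sentinel strings.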

if source and key not in source:
    continue
if value in values:
    fieldMap[key][target][row_number] = value

Member commented:

So the empty values go into the schema??
This really contradicts the documentation and makes very little sense to me...
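
To make the concern concrete, here is a rough sketch of what the quoted assignment appears to build, assuming `fieldMap` maps field names to the field descriptors in the schema (which is how the comment reads it). The function name `extract_missing`, the `target` value `'missing'`, and the 1-based row numbers are assumptions for illustration, not the PR's actual code.

```python
def extract_missing(rows, fields, values, target):
    """Hypothetical sketch: record missing cells inside each field descriptor."""
    # Assumption: field_map points at the schema's own field dicts,
    # so writing to it modifies the schema in place.
    field_map = {field['name']: field for field in fields}
    for field in fields:
        field.setdefault(target, {})
    # Assumption: rows are numbered from 1.
    for row_number, row in enumerate(rows, start=1):
        for key, value in row.items():
            if key not in field_map:
                continue
            if value in values:
                field_map[key][target][row_number] = value
    return fields


fields = [{'name': 'id'}, {'name': 'score'}]
rows = [
    {'id': '1', 'score': 'err'},
    {'id': '2', 'score': '9'},
    {'id': '3', 'score': 'n/a'},
]
extract_missing(rows, fields, values=['err', 'n/a'], target='missing')
print(fields[1])  # {'name': 'score', 'missing': {1: 'err', 3: 'n/a'}}
```

Under this reading, every missing cell adds an entry to a field descriptor, so the schema grows with the data rather than only describing it.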

Successfully merging this pull request may close these issues.

Ability to preserve missingValues in dump
3 participants