Context mapping #120

Merged
merged 116 commits into from
Mar 18, 2021
Conversation

bl-young
Collaborator

@bl-young bl-young commented Mar 10, 2021

- Refactor the building of mapping files to support distinct mappings for flows and contexts
- Add a separate folder for mapping input files, apart from the flow list input files
- Add mapping for ImpactWorld+ (LCIAfmt: PR 43)
- Consolidate code into new functions: add_uuid_to_mapping, add_conversion_to_mapping
- Add new flows to support USDA_CUS, and add them to this mapping file
- Expand biological secondary contexts
- Add new tests for duplicate flows and contexts across classes

Note: changes to some mapping files (NEI, TRI, ReCiPe) primarily reflect reordering following generation by script
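
The new duplicate tests are not shown in this thread. As a rough illustration only, a check for the same flowable and context appearing under more than one class might look like the sketch below. The row layout, the class names, and the helper `duplicated_keys` are assumptions, not the repository's actual code.

```python
from collections import Counter

# Hypothetical rows: (flowable, context, flow class). The values are made up
# for illustration; real rows would come from the flow list input files.
flows = [
    ("Methane", "emission/air", "Chemicals"),
    ("Methane", "emission/air", "Fuel"),
    ("Borax", "emission/air", "Chemicals"),
]


def duplicated_keys(rows):
    """Return (flowable, context) pairs that occur more than once in any class."""
    counts = Counter((flowable, context) for flowable, context, _ in rows)
    return sorted(key for key, n in counts.items() if n > 1)


duplicates = duplicated_keys(flows)
print(duplicates)
```

A test built on this idea would assert that the returned list is empty.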

hottleta and others added 30 commits October 14, 2020 15:08
Update NEI_FlowableMappings.csv
Update NEI_FlowableMappings.csv
…ld NEI/TRI flows to flow mapping files, and created new mapping files for NEI/TRI for QA
…t files and add 1 to all blank/nan entries prior to applying default conversion factors where applicable
…maintains warning to check for duplicate flows in source flowables list
ashleyedelen and others added 18 commits March 11, 2021 12:44
Added air compartment for pesticides that had been in the list before the additional pesticides from USDA were added.
updated chemical air emissions for pesticides
added s-kinoprene emission/air
added borax as an emission to air
added necessary geological air emissions categories for pesticides work
added Silica primary context to air for pesticide work
edits to finalize pesticide mapping: petroleum oil is mapped to hydrocarbons, petroleum; silicon dioxide is mapped to silica
addition of water primary contexts to insecticide lists
updated geological secondary context for emissions based on pesticides update
Collaborator

@WesIngwersen WesIngwersen left a comment

The flow mapping unit tests are failing for me:

583236 != 583296

Expected :583296
Actual   :583236
<Click to see difference>

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm 2020.2.1\plugins\python\helpers\pycharm\teamcity\diff_tools.py", line 32, in _patched_equals
    old(self, first, second, msg)
  File "C:\Users\wesle\AppData\Local\Programs\Python\Python37\lib\unittest\case.py", line 839, in assertEqual
    assertion_func(first, second, msg=msg)
  File "C:\Users\wesle\AppData\Local\Programs\Python\Python37\lib\unittest\case.py", line 832, in _baseAssertEqual
    raise self.failureException(msg)
AssertionError: 583296 != 583236

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\wesle\AppData\Local\Programs\Python\Python37\lib\unittest\case.py", line 59, in testPartExecutor
    yield
  File "C:\Users\wesle\AppData\Local\Programs\Python\Python37\lib\unittest\case.py", line 615, in run
    testMethod()
  File "C:\Users\wesle\Federal-LCA-Commons-Elementary-Flow-List\tests\test_flow_mappings.py", line 31, in test_no_nas_in_required_fields
    self.assertEqual(len(flowmappings_w_required), len(nas_in_required))



582627 != 583296

Expected :583296
Actual   :582627
<Click to see difference>

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm 2020.2.1\plugins\python\helpers\pycharm\teamcity\diff_tools.py", line 32, in _patched_equals
    old(self, first, second, msg)
  File "C:\Users\wesle\AppData\Local\Programs\Python\Python37\lib\unittest\case.py", line 839, in assertEqual
    assertion_func(first, second, msg=msg)
  File "C:\Users\wesle\AppData\Local\Programs\Python\Python37\lib\unittest\case.py", line 832, in _baseAssertEqual
    raise self.failureException(msg)
AssertionError: 583296 != 582627

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\wesle\AppData\Local\Programs\Python\Python37\lib\unittest\case.py", line 59, in testPartExecutor
    yield
  File "C:\Users\wesle\AppData\Local\Programs\Python\Python37\lib\unittest\case.py", line 615, in run
    testMethod()
  File "C:\Users\wesle\Federal-LCA-Commons-Elementary-Flow-List\tests\test_flow_mappings.py", line 41, in test_targetflowinfo_matches_flows_in_list
    self.assertEqual(len(flowmapping_targetinfo), len(flowmappings_w_flowlist))

@WesIngwersen
Collaborator

WesIngwersen commented Mar 16, 2021

Missing flows from test_no_nas_in_required_fields


SourceListName | SourceFlowName | SourceFlowContext | SourceUnit | TargetFlowName | TargetFlowUUID | TargetFlowContext | TargetUnit
-- | -- | -- | -- | -- | -- | -- | --
EIA_MER | Coal Production |  | MJ | Coal | 39a07dad-33e8-3644-b6a5-2054c3d53fa9 | resource/ground | MJ
EIA_MER | Natural Gas (Dry) Production |  | MJ | Natural gas | ae604834-c49a-3ebb-b15b-06e6e89e427e | resource/air | MJ
EIA_MER | Natural Gas Plant Liquids Production |  | MJ | Natural gas | ae604834-c49a-3ebb-b15b-06e6e89e427e | resource/air | MJ
EIA_MER | Crude Oil Production |  | MJ | Crude oil | 4aca0c2c-6e1a-3d90-956c-7141ca8932ce | resource/water | MJ
EIA_MER | Primary energy - nuclear |  | MJ | Uranium | af4664da-60f0-31e5-83ba-0d2518959c5b | resource/ground | kg
EIA_MER | Primary energy - geothermal |  | MJ | Energy, geothermal | d2792697-9e22-3380-974f-54b92a3549a9 | resource/ground | MJ
EIA_MER | Primary energy - biomass |  | MJ | Biomass | 7feeb363-fbeb-37ad-937f-080834b9dc35 | resource/biotic | kg
EIA_MER | Primary energy - hydro |  | MJ | Energy, hydro | 5af7a834-bf92-32eb-a0b2-2f2dcdc9f3d9 | resource/water | MJ
EIA_MER | Primary energy - solar |  | MJ | Energy, solar | 4d1571a6-ffff-3a36-82df-224ed975a094 | resource/air | MJ
EIA_MER | Primary energy - wind |  | MJ | Energy, wind | 35962866-662b-3817-a2a7-39375c7e6a3c | resource/air | MJ
USGS_MCS | Copper; mine |  | kg | Copper | fddcce55-84af-3798-81cc-47fedcf811b8 | resource/ground | kg
USGS_MCS | Lead; mine, concentrates |  | kg | Lead | b66af8c0-de54-3a35-ace0-7743af98f6de | resource/ground | kg
USGS_MCS | Nickel; mine |  | kg | Nickel | d1ce496a-891d-3fae-9188-7345718cd600 | resource/ground | kg
USGS_MCS | Zinc; mine, zinc in concentrate |  | kg | Zinc | 7dbe300b-d061-3e5e-a9f0-1813e89d415c | resource/ground | kg
USGS_MCS | Lime; lime |  | kg | Limestone | 13f50f4e-6aee-34f6-b3af-20503373f75d | resource/ground | kg
USGS_MCS | Sand and Gravel; sand and gravel (construction) |  | kg | Sand | 96ad9cfe-d563-33ca-8dbc-efe34aaaca0a | resource/ground | kg
USGS_MCS | Sand and Gravel; sand and gravel (industrial) |  | kg | Sand | 96ad9cfe-d563-33ca-8dbc-efe34aaaca0a | resource/ground | kg
USGS_MCS | Sand and Gravel; sand and gravel (construction) |  | kg | Gravel | dd983461-0533-301e-bde1-e1d5f1d55c4e | resource/ground | kg
USGS_MCS | Sand and Gravel; sand and gravel (industrial) |  | kg | Gravel | dd983461-0533-301e-bde1-e1d5f1d55c4e | resource/ground | kg
USGS_MCS | Stone; Stone (crushed) |  | kg | Stone | 26484009-712e-32d1-8f80-456363c6c226 | resource/ground | kg
USGS_MCS | Stone; Stone (dimension) |  | kg | Stone | 26484009-712e-32d1-8f80-456363c6c226 | resource/ground | kg
USGS_MCS | Beryllium; mine shipments |  | kg | Beryllium | f7ff85d1-1dc5-31d1-8c08-47bd566a32b4 | resource/ground | kg
USGS_MCS | Cobalt; cobalt content |  | kg | Cobalt | aec2cf2a-b55b-3598-a164-05f8ba8ad3ee | resource/ground | kg
USGS_MCS | Gold; mine |  | kg | Gold | affc3f7b-6f37-3ff0-beb0-981ec1567bc3 | resource/ground | kg
USGS_MCS | Iron Ore, US production |  | kg | Iron ore | 6fca43a7-13fb-315d-a184-8af7d40ff1d3 | resource/ground | kg
USGS_MCS | Magnesium; magnesium compounds |  | kg | Magnesium | 8284cb6c-7887-322d-8abd-8682e981791a | resource/ground | kg
USGS_MCS | Molybdenum; mine |  | kg | Molybdenum | b7643fb5-0299-371b-b148-3e24ab1c469e | resource/ground | kg
USGS_MCS | Platinum group metals; palladium |  | kg | Palladium | c84f9368-f27b-3698-badf-84728a9a170b | resource/ground | kg
USGS_MCS | Platinum group metals; platinum |  | kg | Platinum | 12aaa508-a741-36a0-99e5-b2ebca11c98b | resource/ground | kg
USGS_MCS | Rare Earths; bastnasite concentrates |  | kg | Bastnaesite | 50bc0c18-20f8-3351-a1b0-29816a96703b | resource/ground | kg
USGS_MCS | Rhenium; rhenium |  | kg | Rhenium | 7e3a1fea-3393-3377-a83f-5210b4368090 | resource/ground | kg
USGS_MCS | Silver; mine |  | kg | Silver | c1469141-98f7-3068-bd5d-1993b7d762d3 | resource/ground | kg
USGS_MCS | Titanium and Titanium Dioxide; mineral concentrate |  | kg | Titanium dioxide | d112a9f2-c53e-30b4-8129-507aedd1882f | resource/ground | kg
USGS_MCS | Zirconium and Hafnium; zirconium, ores and concentrates |  | kg | Zirconium | 3ca65344-7d94-3bc1-8454-32d2e77766f4 | resource/ground | kg
USGS_MCS | Barite; sold or used, mine |  | kg | Barite | 09ce2596-a9f3-3373-9049-342d0e8929a3 | resource/ground | kg
USGS_MCS | Boron |  | kg | Boron | 4c4afdaf-8575-3c7c-bf8f-4ddea29fd682 | resource/ground | kg
USGS_MCS | Clays; Ball clay |  | kg | Clay | 2c6a4656-52d3-3fbf-9e1b-84a047d4c78c | resource/ground | kg
USGS_MCS | Clays; Common clay |  | kg | Clay | 2c6a4656-52d3-3fbf-9e1b-84a047d4c78c | resource/ground | kg
USGS_MCS | Clays; Fire clay |  | kg | Clay | 2c6a4656-52d3-3fbf-9e1b-84a047d4c78c | resource/ground | kg
USGS_MCS | Clays; Fuller's earth |  | kg | Clay | 2c6a4656-52d3-3fbf-9e1b-84a047d4c78c | resource/ground | kg
USGS_MCS | Clays; Bentonite |  | kg | Bentonite | 72e54433-a3d9-348d-aac4-684499fea92b | resource/ground | kg
USGS_MCS | Clays; Kaolin |  | kg | Kaoline | 5610a066-f8c1-3a18-9255-70a7b07e78df | resource/ground | kg
USGS_MCS | Diatomite; diatomite |  | kg | Diatomite | 8db22f24-6361-3d68-8aa4-4cad5d7db52c | resource/ground | kg
USGS_MCS | Feldspar; marketable |  | kg | Feldspar group | ebe0c1c3-9d31-3099-b4ac-c5519c2166e3 | resource/ground | kg
USGS_MCS | Fluorspar; fluorspar equivalent from phosphate rock |  | kg | Fluorite | ea940079-0b1a-3e26-9013-c5ace83fa0f5 | resource/ground | kg
USGS_MCS | Garnet (Industrial); crude |  | kg | Garnet group | 08892041-9412-351c-a166-9f0966dbd29d | resource/ground | kg
USGS_MCS | Gypsum; crude |  | kg | Gypsum | 1f8ed538-0c45-3e06-97cc-70033132a450 | resource/ground | kg
USGS_MCS | Kyanite and related; mine |  | kg | Kyanite | cc6e343d-b2e6-38f2-bdf8-21a23e49ee97 | resource/ground | kg
USGS_MCS | Lithium; lithium |  | kg | Lithium | 4265cb8a-11b6-3949-90ee-9eec7a7ec5a3 | resource/ground | kg
USGS_MCS | Mica; mine |  | kg | Mica | 432ca42e-8fd0-3a2a-b37b-d66aebdd048d | resource/ground | kg
USGS_MCS | Peat; peat |  | kg | Peat | 6e414092-5f92-3cd9-afaa-94ddeacb895b | resource/ground | MJ
USGS_MCS | Perlite; perlite |  | kg | Perlite | 66c6fc98-3707-3b5f-80f2-e91df2ac7912 | resource/ground | kg
USGS_MCS | Phosphate Rock; marketable |  | kg | Phosphate ore | 1ab24e9c-18f9-3aeb-a968-d1c6f25a1ef4 | resource/ground | kg
USGS_MCS | Potash; marketable |  | kg | Potassium oxide | 9e478ab2-41b2-391e-9c0a-a0d4b38fdd74 | resource/ground | kg
USGS_MCS | Pumice and Pumicite; mine |  | kg | Pumice | 939fb88f-7bf5-36fe-a417-4610cf1ff3f8 | resource/ground | kg
USGS_MCS | Salt; salt |  | kg | Sodium chloride | f338dd70-dcd0-33e9-b26a-ba5c8042f793 | resource/ground | kg
USGS_MCS | Soda Ash; soda ash |  | kg | Sodium carbonate | 546de5ab-4d9b-3a33-92c5-5425ec52249a | resource/ground | kg
USGS_MCS | Talc and pyrophyllite; mine |  | kg | Talc | acd4abb1-54fc-3c02-ab6e-36d8822062ae | resource/ground | kg
USGS_MCS | Vermiculite; vermiculite |  | kg | Vermiculite | 19be0c5c-690b-32d0-a36e-be3e629a4518 | resource/ground | kg
USGS_MCS | Zeolites; zeolites, natural |  | kg | Zeolites | 414aa017-7696-316e-802b-dcc3c7fb36d1 | resource/ground | kg
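
The 60 rows above match the gap between the two counts in the failing assertion (583296 vs. 583236). A length comparison only reports that something is missing; a sketch like the one below would list the offending rows directly. It assumes pandas, a hypothetical `required_fields` list (the actual list used by the test is not shown in this thread), and a placeholder UUID in the second row.

```python
import pandas as pd

# Hypothetical set of required columns; the real test may use a different list.
required_fields = [
    "SourceListName",
    "SourceFlowName",
    "SourceFlowContext",
    "TargetFlowUUID",
]

# Two illustrative rows: the first mirrors a row above with no source context,
# the second is invented (placeholder UUID) and has every required field.
flowmappings = pd.DataFrame(
    {
        "SourceListName": ["EIA_MER", "NEI"],
        "SourceFlowName": ["Coal Production", "Methane"],
        "SourceFlowContext": [None, "air"],
        "TargetFlowUUID": [
            "39a07dad-33e8-3644-b6a5-2054c3d53fa9",
            "00000000-0000-0000-0000-000000000000",
        ],
    }
)

flowmappings_w_required = flowmappings[required_fields]

# Keep only the rows where at least one required column is missing.
rows_with_nas = flowmappings_w_required[flowmappings_w_required.isna().any(axis=1)]
print(rows_with_nas)
```

Asserting that `rows_with_nas` is empty, and printing it on failure, would surface the missing contexts without a separate export.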

@bl-young
Collaborator Author

Yes on the first failure, these flow mappings for USEEIOr don't have source flow contexts. I'll revise them to match the satellite tables.

On the second failure, if you update your parquet it should be resolved. I did not yet push a v1.0.7 parquet but I can on this branch.

@WesIngwersen
Collaborator

FYI in flowlist.py I'm getting this INFO message

INFO 145 flows with multiple alt unit; these duplicates have been removed:
         Uranium
         Methane
 Coalbed methane
     Natural gas

The flow list was originally designed to allow only one alternate unit. Allowing multiple would be an enhancement, but they have to be associated with a single flow, so the format of the alt units would likely have to change, perhaps to a list. In the meantime, only one alt unit will be kept.
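
One possible shape for the list-based format, sketched with pandas: collapse a long table of (flow, alternate unit) rows into a single row per flow whose alt-unit cell holds a list. The column names `Flowable` and `AltUnit` and the unit values are assumptions made for illustration, not the flow list's confirmed schema.

```python
import pandas as pd

# Hypothetical long-format input: one row per (flow, alternate unit).
# The unit values are illustrative only.
alt_units = pd.DataFrame(
    {
        "Flowable": ["Uranium", "Uranium", "Methane", "Methane"],
        "AltUnit": ["MJ", "kBq", "MJ", "m3"],
    }
)

# Collapse to one row per flow, gathering all alternate units into a list.
units_per_flow = (
    alt_units.groupby("Flowable", sort=True)["AltUnit"]
    .agg(list)
    .reset_index()
)
print(units_per_flow)
```

Each flow then stays a single record, so no flow has to be dropped as a duplicate, at the cost of list-valued cells that need care when filtering or serializing.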

@WesIngwersen
Collaborator

WesIngwersen commented Mar 16, 2021

Here is a nice helpful post for dealing with lists within pd dataframes if we go that route
https://towardsdatascience.com/dealing-with-list-values-in-pandas-dataframes-a177e534f173

@WesIngwersen
Collaborator

> Yes on the first failure, these flow mappings for USEEIOr don't have source flow contexts. I'll revise them to match the satellite tables.
>
> On the second failure, if you update your parquet it should be resolved. I did not yet push a v1.0.7 parquet but I can on this branch.

I confirmed that the second test, test_targetflowinfo_matches_flows_in_list, passes once the list is regenerated. Thanks for the tip.

@bl-young
Collaborator Author

> Here is a nice helpful post for dealing with lists within pd dataframes if we go that route
> https://towardsdatascience.com/dealing-with-list-values-in-pandas-dataframes-a177e534f173

Yes, a list might ultimately be the better way to go. Right now the flow list parquet can only handle a single alternate unit, but when the list is generated in JSON the additional alternate units get re-added. That logger line may be better as a debug than an info message given the confusion it might cause, since flows aren't really being removed.
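
A minimal sketch of that change using the standard `logging` module. The logger name, the function `report_multiple_alt_units`, and the message wording are hypothetical and not taken from flowlist.py.

```python
import logging

logging.basicConfig(format="%(levelname)s %(message)s")
log = logging.getLogger("flowlist_sketch")  # hypothetical logger name
log.setLevel(logging.INFO)


def report_multiple_alt_units(flow_names):
    """Report flows with several alternate units at DEBUG rather than INFO."""
    log.debug(
        "%d flows with multiple alt units; only one is kept in the parquet: %s",
        len(flow_names),
        ", ".join(flow_names),
    )


# At INFO level the message stays hidden; setting the logger to DEBUG shows it.
report_multiple_alt_units(["Uranium", "Methane", "Coalbed methane", "Natural gas"])
```

With the logger at INFO, a routine run prints nothing for this case, while anyone investigating alternate units can still switch to DEBUG to see the affected flows.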


4 participants