Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TimesModel.data_years() throws exception for non-int years #217

Open
SamRWest opened this issue Mar 13, 2024 · 4 comments
Open

TimesModel.data_years() throws exception for non-int years #217

SamRWest opened this issue Mar 13, 2024 · 4 comments
Assignees

Comments

@SamRWest
Copy link
Collaborator

SamRWest commented Mar 13, 2024

Austimes has some non-integer-parseable year values (e.g. 2070-EOH) which this function fails on:

pd.unique(attributes["year"])
array([0, 2015.0, 2016.0, 2017.0, 2018.0, 2019.0, 2020.0, 2021.0, 2022.0,
       2023.0, 2024.0, 2025.0, 2026.0, 2027.0, 2028.0, 2029.0, 2030.0,
       2031.0, 2032.0, 2033.0, 2034.0, 2035.0, 2036.0, 2037.0, 2038.0,
       2039.0, 2040.0, 2041.0, 2042.0, 2043.0, 2044.0, 2045.0, 2046.0,
       2047.0, 2048.0, 2049.0, 2050.0, 2051.0, 2052.0, 2053.0, 2054.0,
       2055.0, 2056.0, 2057.0, 2058.0, 2059.0, 2060.0, 2013.0, 2011.0,
       2009.0, 2007.0, 2005.0, 2101.0, 2012, '2070-EOH', 2061, 2010, 2014,
       '2024-EOH', '2045-EOH', 'BOH-EOH', 1980, 1981, 1982, 1983, 1984,
       1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995,
       1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2006, 2008],
      dtype=object)

If the intention was to collect all the valid years, this could be fixed by updating to:

data_years.update(attributes["year"].astype(int).values)

to:

 int_years = attributes["year"].astype(int, errors='ignore') # leave non-parseable vals alone
 int_years = [y for y in int_years if isinstance(y, int)]  # remove non-parseable years
 data_years.update(int_years.values)

Would this work, or would we need to interpret/expand these ranges?

@olejandro
Copy link
Member

olejandro commented Mar 14, 2024

Interesting! Dash-separated values should not have reached this far... We should definetely expand them. Could you check in raw_tables.txt the tag of the table they are originating from?

@olejandro olejandro changed the title data_types.data_years() throws exception for non-int years TimesModel.data_years() throws exception for non-int years Mar 14, 2024
@SamRWest
Copy link
Collaborator Author

Interesting! Dash-separated values should not have reached this far... We should definetely expand them. Could you check in raw_tables.txt the tag of the table they are originating from?

Sure thing, these are all the spots they appear in raw_tables.txt

range: L12:R16
filename: benchmarks/xlsx\../../../austimes-lfs\SuppXLS/Scen_Par-austimes_CCA.xlsx
tag: ~TFM_INS
types: Attribute (object), PSET_SET (object), PSET_PN (object), CSET_CN (object), Year (object), AllRegions (object)
Attribute,PSET_SET,PSET_PN,CSET_CN,Year,AllRegions
PRC_FOFF,,T*Dsl,"TraDsl,TraB20",2070-EOH,1
PRC_FOFF,,T*Pet,"TraPet,TraE10",2070-EOH,1
PRC_FOFF,,T*Lpg,TraLpg,2070-EOH,1
range: B72:F74
filename: benchmarks/xlsx\../../../austimes-lfs\SuppXLS/Scen_nsv-austimes_csiro_1.xlsx
uc_sets: {'R_S': 'NSA,ADE,SESA,NQ,CQ,SWQ,SEQ,LV,MEL,CVIC,NVIC,SWNSW,NNS,NCEN,CAN,TAS,ACT,NSW,QLD,SA,VIC'}
tag: ~TFM_INS
types: attribute (object), process (object), year (object), value (object)
attribute,process,year,value
PRC_AOFF,"EE_StmTurb312,EE_StmTurb313",2024-EOH,1
range: L10:R14
filename: benchmarks/xlsx\../../../austimes-lfs\SuppXLS/Scen_nsv-austimes_csiro_1.xlsx
tag: ~TFM_INS
types: Attribute (object), PSET_SET (object), PSET_PN (object), CSET_CN (object), Year (object), AllRegions (object)
Attribute,PSET_SET,PSET_PN,CSET_CN,Year,AllRegions
PRC_FOFF,,T*Dsl,"TraDsl,TraB20",2045-EOH,1
PRC_FOFF,,T*Pet,"TraPet,TraE10",2045-EOH,1
PRC_FOFF,,T*Lpg,TraLpg,2045-EOH,1
range: S22:AB25
filename: benchmarks/xlsx\../../../austimes-lfs\SuppXLS/Scen_TransportPolicies.xlsx
uc_sets: {'R_E': 'allregions'}
tag: ~TFM_INS
types: attribute (object), CSET_CN (object), year (object), allregions (object), NSW (object), QLD (object), VIC (object), PSET_CO (object), PSET_CI (object)
attribute,CSET_CN,year,allregions,NSW,QLD,VIC,PSET_CO,PSET_CI
PRC_FOFF,TraE10,BOH-EOH,1,0,0,0,TraE10,
PRC_FOFF,TraE10,BOH-EOH,1,,,,-T_pas?,TraE10

@olejandro
Copy link
Member

Thanks! We currently expand dash-separated values only for FI_T tables (lines 542-555 in process_flexible_import_tables. The easiest fix would be to extract that code to a separate tranform and apply it also to TFM_INS tables.

@SamRWest
Copy link
Collaborator Author

No worries.
Thanks for the pointer, I'll look into it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants