Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

problems with Smkinven daily area source processing #91

Closed
callen52 opened this issue Jan 30, 2024 · 12 comments
Closed

problems with Smkinven daily area source processing #91

callen52 opened this issue Jan 30, 2024 · 12 comments

Comments

@callen52
Copy link

I've found a couple of issues with daily area source processing in Smkinven.

  1. When SMK_PROCESS_HAPS = ALL, we get this error in Smkinven (caused by all VOC being converted to NONHAPVOC):
    INTERNAL ERROR: Could not find variable "VOC" in map file
    We are working around this by setting SMK_PROCESS_HAPS = PARTIAL and only including out-of-domain sources in the NHAPEXCLUDE so as to not affect the modeling.

  2. For March, everything runs fine until it tries to process April 1 (timestep 2021091 in this case), at which point Smkinven returns a bunch of bogus "Both VOC|TOG and toxics are not found" errors for sources which in fact have both VOC and toxics. This only affects timesteps from 2021091:040000 onwards, so this doesn't actually affect March modeling. Smkinven for April works just fine.

I'm attaching some sample input data and Smkinven logs for the two issues.
for_daily_processing_smoke_bug_report_30jan2024.zip

@hnqtran
Copy link
Contributor

hnqtran commented Feb 2, 2024

Hi Christine. I'm wanting to replicate the issue that you are having. Could you please share with me your run script along with its input files?

@callen52
Copy link
Author

callen52 commented Feb 2, 2024

Hi Huy,

Here is the run script (Monthly_area_...), COSTCY, INVTABLE, and an updated version of the smk_ar_monthly_emf.csh helper script to support this type of processing. I think this plus the the contents of what I originally posted should give you all you need to run Smkinven, but let me know if you're still missing anything.

costcy_for_2017platform_20dec2023_v8.txt
invtable_2017_NATA_CMAQ_26apr2023_v7.txt
Monthly_area_Q1_livestock_12US1_2021hb_cb6_21k_20240130092302.csh.txt
smk_ar_monthly_emf.csh.txt

@hnqtran
Copy link
Contributor

hnqtran commented Feb 8, 2024

Hi Christine,

I was able to replicate the first issue that you reported, and I found that the solution to issue #82 would solve this. Would you please make the modifications to SETNONHAP subrountine, recompile smkinven and let me know if it work?

I'm trying to get to the second issue and let will you know if I find anything. Thanks.

Huy

hnqtran pushed a commit that referenced this issue Feb 15, 2024
@hnqtran
Copy link
Contributor

hnqtran commented Feb 15, 2024

Progress update:

  • The first issue has been solved with modification made to $SMOKE_HOME/src/smkinven/setnonhap.f QA check on livestock output premerged files before and after bug-fixed (run with SMK_PROCESS_HAPS "PARTIAL") show value differences are in acceptable range < 0.001% (see livestock.HAPS_patial.orig_vs_fixed.m3diff.txt). No value differences found in bug-fixed version between running with SMK_PROCESS_HAPS "PARTIAL" and SMK_PROCESS_HAPS "ALL" (see livestock.fixed.HAPS_patial_vs_all.m3diff.txt), which is expected when using the dummy NHAPEXCLUDE file from Christine.

  • I was able to replicate the 2nd issue but have not developed a fix for this one.

livestock.HAPS_patial.orig_vs_fixed.m3diff.txt
livestock.fixed.HAPS_patial_vs_all.m3diff.txt

@hnqtran
Copy link
Contributor

hnqtran commented Feb 22, 2024

Progress update:
There are two daily emission input files from Christine's test packages: livestock_2021hb_FEM_daily_test.csv and livestock_2021hb_nonFEM_daily_test.csv. In these two files, emissions entries for some FIPS and SCCs are unsorted, e.g. their entires for January through May are in one part of the files, and entries for June - December are in another part of the files.

As the livestock_2021hb_FEM_daily_test.csv was sorted (attached), i.e. emission entries for each combination of FIPS, SCC and pollutant are put in 12 consecutive lines for January - December, smkinven ran successfully.

However, smkinven still failed with error "Found VOC|TOG but no toxics found" when running with the livestock_2021hb_nonFEM_daily_test.csv even after it was sorted.

Some sort functions called in smkinven are the suspect for this bug.

livestock_2021hb_FEM_daily_test_sorted.csv

@hnqtran
Copy link
Contributor

hnqtran commented Mar 14, 2024

Progress update: (edited for more detail)

The second issue in this ticket related to bogus "Both VOC|TOG and toxics are not found" errors was found related to treatment of daylight saving time in the subroutine $SMOKE_HOME/src/smkinven/rdff10pd.f which reads data from daily specific emission files.

In brief, the sequential processes in rdff10pd.f are as follow:

  • Read in line by line the content of input ff10 daily emission input file.
  • For each data line, determines FIPS code, SCC and pollutant ID, find index (COD) of this pollutant ID in the list of input pollutant and find index CIDX of this pollutant ID in the emission inventory table. For each unique combination of FIPS, SCC and COD, increment the number of concatenated source (S) by 1.
  • For each day of the month in this data line (March is the month in this case), calculate hourly emission and record data for next step.

Issue happens within this block of code between line 899 - 923

                H = 0
                DO T = PTR, MIN( PTR + 23, NSTEPS )

                    H = H + 1
                    NPDPT( T ) = NPDPT( T ) + 1

                    HS = NPDPT( T )

                    IF( HS .LE. MXPDSRC ) THEN

                        IDXSRC( HS,T ) = HS
                        SPDIDA( HS,T ) = S
                        CIDXA ( HS,T ) = CIDX
                        CODEA ( HS,T ) = COD
                        IF( CEMPOL ) THEN
                            EMISVA( HS,T ) = CONVFAC * EMIS(S,COD) * TDAT( D,H )  ! Store data in emissions
                            DYTOTA( HS,T ) = CONVFAC * EMIS(S,COD) * TOTAL
                        ELSE
                            EMISVA( HS,T ) = CONVFAC * TDAT( D,H )  ! Store data in emissions
                            DYTOTA( HS,T ) = CONVFAC * TOTAL
                        END IF

                    END IF

                END DO

Assuming that we have a simplified daily emission input file for a single FIPS 01005 with single SCC 2805035000 with four pollutant NH3, BENZENE, TOLUENE, VOC. In this case, S = 1, max number of sources MXPDSRC = 4 (4 unique combinations of FIPS/SCC/COD).

If we write out values of all variables in the above loop for debug, it would look like the following when it read to the last day of March (Time step = T =768):

FIPS= 01005; SCC= 2805035000; NH3; T= 768; NPDPT(T)= 1; HS= 1; S = 1; CIDX= 695; COD= 1; EMISVA= 0.851D-01

When rdff10pd.f read next data for April, and because March was marked for daylight saving time, rdff10pd.f rewrite data of last hour in March with first hour in April (in local time, which is 05:00 UTC for this FIPS), and the debug line for this hour will look as the following:

FIPS= 01005; SCC= 2805035000; NH3; T= 768; NPDPT(T)= 2; HS = 2; S = 1; CIDX= 695; COD= 1; EMISVA= -0.100D+38

Note that while rewriting data for T = 768, rdff10pd.f also mistakenly increment value of NPDPT(T) and HS. This means for every pollutant, HS value at this time step increased by 2 instead of 1. As rdff10pd.f finished reading data for BENZENE at this time step, HS = 4. When it got to reading data for VOC at this time step, HS = 7 then 8. However, rdff10pd.f only recorded data when HS <= MXPDSRC and so any data associated with HS > 4 will be ignored. Therefore VOC and TOLUENE were not recorded for this time step T = 768. Consequently, smkinven aborted in the subsequent module with error "Found toxics but no VOC|TOG found for the source" (it found BENZENE but could not found VOC; also could not found TOLUENE but this does not cause the crash).

Depending on the order of pollutant appearance in the daily input file, smkinven could crash or not with different message:

  • If order is NH3, VOC, BENZENE, TOLUENE: crashes with "Found VOC|TOG but no toxics found" (BENZENE and TOLUENE are missing)
  • If order is VOC, BENZENE, TOLUENE, NH3: will not crash but give out warning "Data missing for TOLUENE" and NH3.

Note that the above issue is only applicable to last time step in March and to counties that observe daylight saving time (DST). There is no issue with data in other time step and in counties that do not observe DST.

@hnqtran
Copy link
Contributor

hnqtran commented Mar 19, 2024

Update (Continue)

Issue with daylight saving time also extends to November where in county observing DST, data for the first hour of December was entirely skipped. The skipped hour, however, does not cause smkinven to crash.

This issue may have been going on for long time and unless it cause smkinven to crash as reported by Christine in this ticket, it was hard to be detected. It only affects the last hour in March or first hour in December, and when we do QA/QC with SMOKE report file which often at annual scale, the difference between input and output may only be subtle.

Fix for this issue:

  • Introducing new variable to keep track of HS value for each combination of source
INTEGER, ALLOCATABLE :: HSVAL(:,:,:)   ! HS value record keeper
IF (.NOT. GETSIZES ) THEN
            IF( ALLOCATED( HSVAL ) ) DEALLOCATE( HSVAL )
            ALLOCATE( HSVAL(NSRC,NIPPA,NSTEPS), STAT=IOS )
            CALL CHECKMEM( IOS, 'HSVAL', PROGNAME )
            HSVAL  = 0
END IF
                H = 0
                DO T = PTR, MIN( PTR + 23, NSTEPS )

                    H = H + 1

c                 Processing for Day light saving start                    
                   IF ( HSVAL(S,COD,T) .NE. 0 ) THEN ! If this combination of source/ pol/timestep was already processed
                       HS = HSVAL(S,COD,T)           ! Recover recorded HS value

c                     Just update emission values and move on
                       IF( HS .LE. MXPDSRC ) THEN

                           IF( CEMPOL ) THEN               
                                EMISVA( HS,T ) = CONVFAC * EMIS(S,COD) * TDAT( D,H )  ! Store data in emissions
                                DYTOTA( HS,T ) = CONVFAC * EMIS(S,COD) * TOTAL
                          ELSE
                                EMISVA( HS,T ) = CONVFAC * TDAT( D,H )  ! Store data in emissions
                                DYTOTA( HS,T ) = CONVFAC * TOTAL
                          END IF 
                      END IF
                            
                      CYCLE
                   END IF

c                 Processing for when Day light saving end;
                   IF ( T .GT. 2 ) THEN
                      IF ( HSVAL(S,COD,T-1) .EQ. 0 ) THEN ! No data had been read for the previous hour of this combo
                         NPDPT(T-1) = NPDPT(T-1) + 1
                         HS = NPDPT(T-1)
                            
                         IF( HS .LE. MXPDSRC ) THEN

                                IDXSRC( HS,T-1 ) = HS
                                SPDIDA( HS,T-1 ) = S
                                CIDXA ( HS,T-1 ) = CIDX
                                CODEA ( HS,T-1 ) = COD
                                IF( CEMPOL ) THEN
                                    EMISVA( HS,T-1 ) = CONVFAC * EMIS(S,COD) * TDAT( D,H )  ! Store data in emissions
                                    DYTOTA( HS,T-1 ) = CONVFAC * EMIS(S,COD) * TOTAL
                                ELSE
                                    EMISVA( HS,T-1 ) = CONVFAC * TDAT( D,H )  ! Store data in emissions
                                    DYTOTA( HS,T-1 ) = CONVFAC * TOTAL
                                END IF
                        END IF   

                        HSVAL(S,COD,T-1) = HS  
                      END IF
                    END IF 
                    
                    NPDPT( T ) = NPDPT( T ) + 1
                    HS = NPDPT( T )

                    IF( HS .LE. MXPDSRC ) THEN

                        IDXSRC( HS,T ) = HS
                        SPDIDA( HS,T ) = S
                        CIDXA ( HS,T ) = CIDX
                        CODEA ( HS,T ) = COD
                        IF( CEMPOL ) THEN
                            EMISVA( HS,T ) = CONVFAC * EMIS(S,COD) * TDAT( D,H )  ! Store data in emissions
                            DYTOTA( HS,T ) = CONVFAC * EMIS(S,COD) * TOTAL
                        ELSE
                            EMISVA( HS,T ) = CONVFAC * TDAT( D,H )  ! Store data in emissions
                            DYTOTA( HS,T ) = CONVFAC * TOTAL
                        END IF

                    END IF

                    HSVAL(S,COD,T) = HS    ! Record HS source index for this combination of source, pol, and timestep

                END DO  

@callen52
Copy link
Author

I've confirmed this has been fixed.

@callen52
Copy link
Author

Re-opening this issue due to a similar bug that has popped up in SMOKE 5.1 (this was not an issue in SMOKE 5.0). For several sectors in which we use daily FF10 inventories, Smkinven is giving bogus "Found VOC|TOG but no toxics found for the source" or "Both VOC|TOG and toxics are not found for the source" errors for sources which have both VOC and toxics.

This attachment includes sample inventories (PTINV and PTDAY), sample run scripts, and a Smkinven log which shows the errors, for the ptagfire sector. This sector produces incorrect "Found VOC|TOG but no toxics found" error messages with SMOKE 5.1 (not with SMOKE 5.0).
ptagfire_daily_test_case_for_github_10jul2024.zip

We are also occasionally seeing these errors with the livestock sector again, but I am having a harder time nailing down a test case for livestock - there must be only very specific circumstances which trigger the error for that sector. I'm hoping that whatever fixes ag fires would also fix any lingering issues with livestock.

@callen52 callen52 reopened this Jul 10, 2024
@hnqtran
Copy link
Contributor

hnqtran commented Jul 11, 2024

@callen52 , I found the ptagfire_daily_test_case_for_github_10jul2024.zip uses auxiliary inputs from 2022 platform. Would you be able to share those input as well or point me to where I can get them?

@callen52
Copy link
Author

invtable_2022platform_integrate_03jun2024_v0.txt
costcy_for_2017platform_20dec2023_v8.txt

Here are the COSTCY and INVTABLE. You can skip Spcmat, Grdmat, Temporal, etc for this test.

@hnqtran
Copy link
Contributor

hnqtran commented Jul 18, 2024

The new issue related to processing ptagfire was found to cause by the new code block implemented to process emission when daylight saving end (starting at line 933 /src/smkinve/rdff10pd.f)

c      Processing for when Day light saving end
        IF (T .GT. 2 ) THEN
             IF ( HSVAL(S,COD,T-1) .EQ. 0) THEN ! No data had been read for previous time step of this combo
                   NPDPT(T-1) = NPDPT(T-1) + 1
                   HS = NPDPT(T-1)

This issue occurs if data in ptday emission input is not in chronological order which is the case for test input file ptday_agburn_2022v1_conus_filtered_02jul2024_v0.AL. This bug was not observed in earlier bug-fix checks with NEI 2021 emission platform inputs since the ptday* emisison input are in chronological order.

Fix for this bug is narrowing the conditions for processing emission when day light saving end

c     Processing for when Day light saving end
       IF (T .GT. 24 ) THEN
            IF ( HSVAL(S,COD,T-1) .EQ. 0
     &       .AND. HSVAL(S,COD,T-2) .GT. 0 ) THEN ! No data had been read for the previous hour of this combo but was read for the hour before that
                  NPDPT(T-1) = NPDPT(T-1) + 1
                  HS = NPDPT(T-1)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants