Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix subdaily to daily ( cut for time) #202

Merged
merged 17 commits into from
May 6, 2020
Merged

Conversation

weiyuan-jiang
Copy link
Contributor

@weiyuan-jiang weiyuan-jiang commented Apr 30, 2020

This pull request addresses a post-processing bug that was introduced in the concatenation of subdaily into daily nc4 files, see #176.
The bug impacted pre-releases v17.9.0-beta.4-SLES[xx].
Also addresses writing of ObsFcstAna and smapL4SMaup binary files into ./scratch and move into ana/ens_avg/year/month directories during post-proc, see #205.

@weiyuan-jiang weiyuan-jiang requested a review from a team as a code owner April 30, 2020 20:03
…C_HIST

- added check for number of sub-daily nc4 files before concatenation into daily nc4 file
- minor cleanup
- fixed white space (indentation)
@gmao-rreichle
Copy link
Contributor

I committed the following changes directly onto the branch "bugfix/sub_daily":

  1. Renamed and clarified resource parameter MONTHLY_OUTPUT --> POSTPROC_HIST
  2. Added check for number of sub-daily nc4 files before concatenation into daily nc4 file
  3. Added comments and minor cleanup

I'm pretty sure item 2. is needed because otherwise an incomplete number of sub-daily files might be concatenated into a daily nc4 file if the job segment does not include complete days.

I successfully ran the "conus" and "global" test cases.

@gmao-qliu @smahanam : Please review the changes carefully, test with your applications, and let me know if your tests work out ok.

@weiyuan-jiang : Please carefully review the scripting changes.

@weiyuan-jiang
Copy link
Contributor Author

I still feel there is a problem. For example, with your changes, what if the program finishes after it produces only one file at the last hour of the month? Originally, it keeps concatenating for every hour. There is a problem for that approach too. I think it may be better to add "daily" into the name we created to concatenate the hourly file. Of course, we need to consider the situation when the file is already daily.

@gmao-rreichle
Copy link
Contributor

@weiyuan-jiang : I think it is safer to concatenate only if the day is complete. If there is only a single file at the end of the month, the user (probably?) doesn't expect to get a concatenated file for that last day of the month. (I assume that if a previous segment provided hourly output for the times prior to that last hour, the post-proc will work as intended.)

My concern was more with the situation where GEOSldas is run in 6-hourly segments (as may be needed, e.g., in near-real time ops). In that case, I think the old code would have created a daily concatenated file after each 6-h segment. On the last day of the month, I think, the monthly avg would have kicked in at 6z, a wrong monthly-mean file would have been created, and the remainder of the day's sub-daily output would not have been concatenated into the last day's daily file.

I'm reluctant to add "daily" to the concatenated file because this would change the output specs, and I don't think it does anything to address the issue of running the concatenation because the time stamp format for the concatenated files already identifies them uniquely. Besides, "daily" then sounds like a daily-mean file, in analogy to "monthly", which is a monthly-mean file.

In a nutshell, I don't think the post-processing was nearly as well thought out as it should have been. Originally, it was narrowly focused on the use case where each segment consisted of full months. We had an issue early on when monthly-mean files were generated even though months weren't complete. My fixes today are an attempt to avoid repeating this mistake with the subdaily-to-daily concatenation. In the long run, it might be cleaner to fully decouple the subdaily-to-daily concatenation and the monthly-mean generation. For now, I still think what we have should be ok except for odd use cases.

I might well be missing something, though... So please do keep commenting if anything looks wrong.

@gmao-qliu
Copy link
Contributor

@gmao-rreichle @weiyuan-jiang I tested this bug fix with various scenarios. Works in all.

weiyuan-jiang and others added 7 commits May 5, 2020 20:30
- writing ObsFcstAna and smapL4SMaup files into scratch then move to ana/ens_avg/year/month dir in postprocessing
- fixes to subdaily-to-daily concatentation and monthly average
…istent with LDAS.rc

additional fix to writing ObsFcstAna and smapL4SMaup files into ./scratch
@gmao-rreichle
Copy link
Contributor

Further edits are in branch bugfix/sub_daily_RR (finalized ~noon on 6 May 2020)
@weiyuan-jiang : Please review, edit as needed, and merge into branch bugfix/sub_daily

@weiyuan-jiang
Copy link
Contributor Author

It has been tested (including .bin files)

@weiyuan-jiang
Copy link
Contributor Author

After merge, I will inform Matt to update the test scripts.

@gmao-rreichle gmao-rreichle merged commit ae0a779 into develop May 6, 2020
@gmao-rreichle gmao-rreichle deleted the bugfix/sub_daily branch May 6, 2020 23:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants