New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ARD overpass notebook + supplementary data #736
Conversation
Supplementary data for ARD_overpass_predictor notebook in dea-notebooks/Frequently_used_code/
Hi Eric, thanks for your PR. I can't seem to find the ARD overpass notebook in question? Could you please provide a link? Thanks. |
Sorry it’s coming I swear!
I had to edit the path in the notebook to specify this example file and was just testing that in the Sandbox. It all works now so will be submitting the notebook shortly.
Thanks
From: Matthew Alger <notifications@github.com>
Sent: Monday, 7 December 2020 10:47 AM
To: GeoscienceAustralia/dea-notebooks <dea-notebooks@noreply.github.com>
Cc: Hay Eric <Eric.Hay@ga.gov.au>; Author <author@noreply.github.com>
Subject: Re: [GeoscienceAustralia/dea-notebooks] Add input dataset for ARD overpass notebook (#736)
Hi Eric, thanks for your PR. I can't seem to find the ARD overpass notebook in question?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub<#736 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ANJD6OJ3IYKT4L7LUTKPZTLSTQJYHANCNFSM4UPT24NA>.
Geoscience Australia Disclaimer: This e-mail (and files transmitted with it) is intended only for the person or entity to which it is addressed. If you are not the intended recipient, then you have received this e-mail by mistake and any use, dissemination, forwarding, printing or copying of this e-mail and its file attachments is prohibited. The security of emails transmitted cannot be guaranteed; by forwarding or replying to this email, you acknowledge and accept these risks.
|
Great, looking forward to it :) Just add it to this PR and we can look at the notebook + data at the same time. |
Hey I have added the notebook! |
Tidied up the fist MD cell with a link to DEA Sandbox, and the DEA image |
@@ -0,0 +1,1677 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can assume that the user will change their input file if they want to analyse somewhere else. Instead of telling the user to change the file, explain how the file is formatted (which you've done below anyway) and they can make that change if they want. So I reckon remove the "Caution" line.
The input file is csv
now right, not xlsx
?
I don't quite understand this line, as I think I should be able to run the notebook without making nchanges:
Make changes to the notebook, following the Steps in bold
Load packages shouldn't be a subsection of description, it should be in a getting started section - check the template and make sure you're matching how it is formatted and structured.
Reply via ReviewNB
@@ -0,0 +1,1677 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are the secondary overpasses not already in datetime format? I think you can get pd.read_csv
to force them all into datetime format automatically.
What is a secondary overpass and why would I want to have one in my input? Could you please add a little explanation?
Don't use os.chdir
as it has a tendency to make the rest of the notebook harder to understand and may break existing scripts. Read the file using a relative path instead. You also can't assume that this notebook is in jovyan/
(e.g. I am testing it in a different place!) so try something like ../Supplementary_data/ARD_overpass_predictor/overpass_input.csv
instead.
Input file looks good!
Reply via ReviewNB
@@ -0,0 +1,1677 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -0,0 +1,1677 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the significance of 20? I'm a bit confused as to what this is doing. I think it's finding the next 20 times Landsat will be overhead? Please make this a bit clearer.
Reply via ReviewNB
@@ -0,0 +1,1677 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -0,0 +1,1677 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is evident from the input file, then we should be able to automatically extract it from the input file. Please do that instead of having to edit the notebook if at all possible.
Also, overpass_input.csv
?
Reply via ReviewNB
@@ -0,0 +1,1677 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the datestep means. Is it a meaningful value? If not, maybe we could just give the rows an index?
Reply via ReviewNB
@@ -0,0 +1,1677 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can output all of the field sites that were in the input? That would make it easier.
Reply via ReviewNB
@@ -0,0 +1,1677 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -0,0 +1,1677 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good idea to have the output filename at the top of the notebook along with other analysis parameters like the input name.
Reply via ReviewNB
This is a really useful notebook @EricHay, and it'll be a great addition to Frequently_used_code! A few comments so far that I've posted on ReviewNB, and a few I'll post here. My main request is that you add more documentation. This looks like a useful tool, but |
Great thanks Matthew. I sure can fix up the documentation and code. I did this a while back as a learning exercise and it is a pretty convoluted process! |
Hey @EricHay, notebook looks awesome! To echo some of what Matthew wrote above, I think the key changes that are needed are:
Happy to help out with any of this stuff! It's looking great through, and will be an excellent addition to the repo. 🚀 |
Thanks @robbibt, sorry i'm a bit flat out today with meetings but will get to polishing this off soon. Agree on both points 👍 and thanks for the feedback! |
Take your time, no rush :) Also feel free to Slack me (or Robbi probably?) if you need any help figuring out how to document it. |
Ok I have done some major tidying. I have simplified things A LOT! It is actually kinda fun to re-visit old notebooks and fix them up.
The notebook is now pretty much automatic. No more specifying which sites to order by etc, and will merge using pd.concat instead of pd.merge which was messy. You can specify an output directory / file name at the top of the notebook. I have also simplified the input file to use 3 sites as an example, and added hashed out lines where you can add extra sites.
The only comment I couldn't really address was about repitition around the looping for different satellites. I am not sure how to "loop loops" :p I combined these all into one cell and tried to explain it a bit better. Let me know if I should change anything further! |
Sorry I forgot to remove some text referring to the old Process. Should be good now! |
@@ -0,0 +1,864 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These commented out bits make the code a little harder to follow:
#Sentinel_2A = Sentinel_2A + datetime.timedelta(hours=10) #convert to local time (Aus eastern standard time) = utc + 10 hours #Sentinel_2A ### to AEDT, add 11h not 10 ###
I think it might be better to simplify the code by removing them from here and instead make it really clear in the Combine dataset
bit that the times are in UTC (?) (and possibly give an example of how to convert the columns in the final table there).
Reply via ReviewNB
@@ -0,0 +1,864 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above: can we remove the commented out lines here? Anyone who wants to look at the data can always edit and add them in themselves
#S2B_combined
Reply via ReviewNB
@@ -0,0 +1,864 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This table looks really nice and useful! The "Site" title for the first column is a tiny bit confusing (perhaps it should be "Overpass"?) but that's a super minor thing and probably not worth updating.
Reply via ReviewNB
Hey @EricHay , I just posted a few very minor comments above, it's looking great! Thanks for all your work in updating this, I think it should be much easier to use now. |
Thanks @robbibt. Easy enough fixes, I agree on all points. This has been a bit of a backburner project, and I didn't even notice the "Site" column in the final table! That can definitely be dropped. I will do another quick tidy, and comment again once done. |
Ok, tidied up the unnecessary commented-out code in cells, updated the final table with "Overpass" as the index, and added optional code in the Combine Dataset section to add / alter time zones, with UTC to AEST as an example (+10h). I think this is straight-forward enough for an average person with some Python knowledge to utilise now :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, did I forget to accept this? I think this looks much nicer and easier to use/follow, thanks for putting in these changes!
When you're happy to merge, select the "squash and merge" option below :)
* Add files via upload Supplementary data for ARD_overpass_predictor notebook in dea-notebooks/Frequently_used_code/ * Add files via upload * Add files via upload * Add files via upload * Add files via upload * Add files via upload * Add files via upload * Add files via upload
Supplementary data for ARD_overpass_predictor notebook in dea-notebooks/Frequently_used_code/
Proposed changes
Supplementary data for ARD_overpass_predictor notebook in dea-notebooks/Frequently_used_code/
Checklist (replace
[ ]
with[x]
to check off)Load packages
General advice
)jupyterlab_code_formatter
tool can be used to format code cells to a consistent style: select each code cell, then clickEdit
and then one of theApply X Formatter
options (YAPF
orBlack
are recommended).NCI
andDEA Sandbox
(flag if not working as part of PR and ask for help to solve if needed)Notebook currently compatible with the NCI|DEA Sandbox environment only
line below the notebook title to reflect the environments the notebook is compatible with