Add ARD overpass notebook + supplementary data #736

Merged
8 commits merged into GeoscienceAustralia:develop on Feb 19, 2021

Conversation

@Eric-git-999 (Contributor) commented Dec 6, 2020


Proposed changes

Supplementary data for ARD_overpass_predictor notebook in dea-notebooks/Frequently_used_code/

Checklist (replace [ ] with [x] to check off)

  • Notebook created using the DEA-notebooks template
  • Remove any unused Python packages from Load packages
  • Remove any unused/empty code cells
  • Remove any guidance cells (e.g. General advice)
  • Ensure that all code cells follow the PEP8 standard for code. The jupyterlab_code_formatter tool can be used to format code cells to a consistent style: select each code cell, then click Edit and then one of the Apply X Formatter options (YAPF or Black are recommended).
  • Include relevant tags in the final notebook cell (refer to the DEA Tags Index, and re-use tags if possible)
  • Clear all outputs, run notebook from start to finish, and save the notebook in the state where all cells have been sequentially evaluated
  • Test notebook on both the NCI and DEA Sandbox (flag if not working as part of PR and ask for help to solve if needed)
  • If applicable, update the Notebook currently compatible with the NCI|DEA Sandbox environment only line below the notebook title to reflect the environments the notebook is compatible with

@Eric-git-999 changed the title from "Add files via upload" to "Add input dataset for ARD overpass predictor notebook" on Dec 6, 2020
@Eric-git-999 changed the title from "Add input dataset for ARD overpass predictor notebook" to "Add input dataset for ARD overpass notebook" on Dec 6, 2020
@MatthewJA (Contributor) commented Dec 6, 2020

Hi Eric, thanks for your PR. I can't seem to find the ARD overpass notebook in question; could you please provide a link? Thanks.

@Eric-git-999 (Contributor, Author) commented Dec 6, 2020 via email

@MatthewJA (Contributor) commented:

Great, looking forward to it :) Just add it to this PR and we can look at the notebook + data at the same time.

@Eric-git-999 changed the title from "Add input dataset for ARD overpass notebook" to "Add ARD overpass notebook + supplementary data" on Dec 7, 2020
@Eric-git-999 (Contributor, Author) commented:

Hey I have added the notebook!

@MatthewJA self-requested a review on December 7, 2020 at 00:08
@Eric-git-999 (Contributor, Author) commented:

Tidied up the first Markdown cell with a link to the DEA Sandbox, and the DEA image.

Review comment (Contributor) on the notebook:

I think you can assume that the user will change their input file if they want to analyse somewhere else. Instead of telling the user to change the file, explain how the file is formatted (which you've done below anyway) and they can make that change if they want. So I reckon remove the "Caution" line.

The input file is CSV now, right, not xlsx?

I don't quite understand this line, as I think I should be able to run the notebook without making any changes:

Make changes to the notebook, following the Steps in bold

Load packages shouldn't be a subsection of Description; it should sit in a Getting started section. Check the template and make sure you're matching how it is formatted and structured.


Reply via ReviewNB

Review comment (Contributor) on the notebook:

Why are the secondary overpasses not already in datetime format? I think you can get pd.read_csv to force them all into datetime format automatically.

What is a secondary overpass and why would I want to have one in my input? Could you please add a little explanation?

Don't use os.chdir as it has a tendency to make the rest of the notebook harder to understand and may break existing scripts. Read the file using a relative path instead. You also can't assume that this notebook is in jovyan/ (e.g. I am testing it in a different place!) so try something like ../Supplementary_data/ARD_overpass_predictor/overpass_input.csv instead.

Input file looks good!


Reply via ReviewNB
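A minimal sketch of the two suggestions above (parsing dates at read time and using a relative path). The column names passed to parse_dates are placeholders; adjust them to the real header of overpass_input.csv:

import pandas as pd

# Read the input via a relative path instead of calling os.chdir first.
# parse_dates tells pandas to convert the named columns to datetime automatically;
# the column names here are assumptions, not the notebook's actual header.
overpass_df = pd.read_csv(
    "../Supplementary_data/ARD_overpass_predictor/overpass_input.csv",
    parse_dates=["overpass", "secondary_overpass"],
)
print(overpass_df.dtypes)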

Review comment (Contributor) on the notebook:

If there's no reason not to use the more accurate timestamps, let's use those.


Reply via ReviewNB

Review comment (Contributor) on the notebook:

What's the significance of 20? I'm a bit confused as to what this is doing. I think it's finding the next 20 times Landsat will be overhead? Please make this a bit clearer.


Reply via ReviewNB
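For illustration, a minimal sketch of what generating "the next 20 overpasses" could look like if the cell steps forward in whole repeat cycles from a reference pass; the reference time and the 16-day Landsat repeat cycle used here are assumptions for the sketch, not values taken from the notebook:

import pandas as pd

reference_pass = pd.Timestamp("2020-12-01 23:50")  # hypothetical last known overpass (UTC)
repeat_cycle = pd.Timedelta(days=16)               # Landsat repeat cycle
n_passes = 20                                      # number of future overpasses to predict

future_passes = [reference_pass + i * repeat_cycle for i in range(1, n_passes + 1)]
predicted = pd.DataFrame({"predicted_overpass": future_passes})
print(predicted.head())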

Review comment (Contributor) on the notebook:

Again, why 32?


Reply via ReviewNB

Review comment (Contributor) on the notebook:

If this is evident from the input file, then we should be able to automatically extract it from the input file. Please do that instead of having to edit the notebook if at all possible.

Also, overpass_input.csv?


Reply via ReviewNB
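A minimal sketch of pulling the site list straight from the input file instead of hard-coding it, assuming a hypothetical "site" column; match the column name to the real CSV layout:

import pandas as pd

overpass_df = pd.read_csv("../Supplementary_data/ARD_overpass_predictor/overpass_input.csv")

# Derive the site names from the data so the notebook needs no editing
# when a user adds or removes sites in the CSV ("site" is a placeholder name).
sites = overpass_df["site"].unique().tolist()
print(sites)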

Review comment (Contributor) on the notebook:

I'm not sure what the datestep means. Is it a meaningful value? If not, maybe we could just give the rows an index?


Reply via ReviewNB

Review comment (Contributor) on the notebook:

Maybe we can output all of the field sites that were in the input? That would make it easier.


Reply via ReviewNB

Review comment (Contributor) on the notebook:

to_csv rather than to_excel


Reply via ReviewNB

Review comment (Contributor) on the notebook:

It's a good idea to have the output filename at the top of the notebook along with other analysis parameters like the input name.


Reply via ReviewNB
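Combining the last two comments, a minimal sketch of an up-front "analysis parameters" cell that holds the input and output filenames, followed by a CSV export; the paths and the stand-in results table are placeholders:

import pandas as pd

# Analysis parameters: the only values a user should need to edit
input_csv = "../Supplementary_data/ARD_overpass_predictor/overpass_input.csv"
output_csv = "overpass_predictions.csv"

# (the notebook body would build the real predictions; a tiny stand-in table
# keeps this sketch runnable)
results_df = pd.DataFrame({"predicted_overpass": pd.to_datetime(["2021-01-05", "2021-01-21"])})

# Export with to_csv rather than to_excel, using the parameter defined above
results_df.to_csv(output_csv, index=False)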

@MatthewJA (Contributor) commented:

This is a really useful notebook @EricHay, and it'll be a great addition to Frequently_used_code! A few comments so far that I've posted on ReviewNB, and a few I'll post here. My main request is that you add more documentation. This looks like a useful tool, but dea-notebooks is an entry point for much of the DEA environment, so we want all notebooks to be well-explained and detailed so that even beginners can understand them. Take a look at some of the other Frequently_used_code notebooks - that's the level of documentation I'd like to see! Could you please add more markdown cells that explain what you're doing and why, so that people without a strong background in the topic can understand what's going on?

There's also a lot of repetition between the satellites and locations, which could be eliminated by some loops or functions. That'd make the notebook much easier to edit and work with, and so it'd be much more useful! Thanks :)
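A hedged illustration of the loops/functions suggestion: one dictionary entry per satellite replaces a copy-pasted block per satellite, and the results are concatenated into a single table. The satellite names, reference passes and cycle lengths below are placeholders, not the notebook's actual values:

import pandas as pd

satellites = {
    "Landsat 8":   {"reference_pass": pd.Timestamp("2020-12-01 23:50"), "cycle_days": 16},
    "Sentinel-2A": {"reference_pass": pd.Timestamp("2020-12-03 00:10"), "cycle_days": 10},
    "Sentinel-2B": {"reference_pass": pd.Timestamp("2020-12-08 00:10"), "cycle_days": 10},
}

frames = []
for name, info in satellites.items():
    cycle = pd.Timedelta(days=info["cycle_days"])
    passes = [info["reference_pass"] + i * cycle for i in range(1, 21)]
    frames.append(pd.DataFrame({"satellite": name, "predicted_overpass": passes}))

# One combined table instead of a separate DataFrame per satellite
all_passes = pd.concat(frames).sort_values("predicted_overpass").reset_index(drop=True)
print(all_passes.head())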

@Eric-git-999 (Contributor, Author) commented:

Great, thanks Matthew. I sure can fix up the documentation and code. I did this a while back as a learning exercise and it is a pretty convoluted process!

@robbibt (Collaborator) commented Dec 7, 2020

Hey @EricHay, notebook looks awesome! To echo some of what Matthew wrote above, I think the key changes that are needed are:

  1. Extra markdown documentation and explanation from a really basic beginner level, explaining both what each section does, but also why it is needed. A lot of users of dea-notebooks have never dealt with complex code before, so we try and walk them through everything pretty slowly. This is particularly the case for the Frequently_used_code directory which is kind of like a "library" where users can jump in and copy and paste examples of code into their own analyses - we want them to have as good an idea as possible about what each section does so they can re-use and re-purpose the code themselves later on.

  2. Wherever possible, it would be great if all bits of code that require user input were moved up to the very top in an "analysis parameters" section (see example here). We typically find that users somewhat blindly run through most of the notebook body without paying close attention to things that need to be changed later in the code, so having all the configurable bits in one single up-front section means they only need to focus on changing one bit rather than having to be on the lookout for changes all the way through in order to get valid results.

Happy to help out with any of this stuff! It's looking great though, and will be an excellent addition to the repo. 🚀

@Eric-git-999 (Contributor, Author) commented:

Thanks @robbibt, sorry I'm a bit flat out today with meetings, but I will get to polishing this off soon. Agree on both points 👍 and thanks for the feedback!

@MatthewJA (Contributor) commented Dec 8, 2020

Take your time, no rush :)

Also feel free to Slack me (or Robbi probably?) if you need any help figuring out how to document it.

@Eric-git-999 (Contributor, Author) commented:

Ok, I have done some major tidying. I have simplified things A LOT! It is actually kinda fun to revisit old notebooks and fix them up.

The notebook is now pretty much automatic. There is no more specifying which sites to order by, etc., and it merges using pd.concat instead of pd.merge, which was messy. You can specify an output directory / file name at the top of the notebook. I have also simplified the input file to use 3 sites as an example, and added commented-out lines where you can add extra sites.

The only comment I couldn't really address was about repetition around the looping for different satellites. I am not sure how to "loop loops" :p

I combined these all into one cell and tried to explain it a bit better.

Let me know if I should change anything further!

@Eric-git-999 (Contributor, Author) commented:

Sorry, I forgot to remove some text referring to the old process. Should be good now!

Review comment (Collaborator) on the notebook:

These commented-out bits make the code a little harder to follow:

#Sentinel_2A = Sentinel_2A + datetime.timedelta(hours=10) #convert to local time (Aus eastern standard time) = utc + 10 hours
#Sentinel_2A ### to AEDT, add 11h not 10 ###

I think it might be better to simplify the code by removing them from here and instead make it really clear in the Combine dataset bit that the times are in UTC (?) (and possibly give an example of how to convert the columns in the final table there).


Reply via ReviewNB
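A minimal sketch of the alternative suggested above: keep the stored times in UTC and show the conversion once in the combined table. The column name and example times are placeholders:

import pandas as pd

combined = pd.DataFrame(
    {"overpass_utc": pd.to_datetime(["2021-01-05 23:50", "2021-01-21 23:50"])}
)

# Times are stored in UTC; convert only for display.
# AEST is a fixed UTC+10 offset; use dt.tz_convert("Australia/Sydney") on a
# timezone-aware column instead if daylight saving (AEDT) matters.
combined["overpass_aest"] = combined["overpass_utc"] + pd.Timedelta(hours=10)
print(combined)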

Review comment (Collaborator) on the notebook:

Same as above: can we remove the commented-out lines here? Anyone who wants to look at the data can always edit and add them back in themselves:

#S2B_combined

Reply via ReviewNB

Review comment (Collaborator) on the notebook:

This table looks really nice and useful! The "Site" title for the first column is a tiny bit confusing (perhaps it should be "Overpass"?) but that's a super minor thing and probably not worth updating.


Reply via ReviewNB

@robbibt (Collaborator) commented Jan 11, 2021

Hey @EricHay, I just posted a few very minor comments above, it's looking great! Thanks for all your work in updating this, I think it should be much easier to use now.

@Eric-git-999 (Contributor, Author) commented:

Thanks @robbibt. Easy enough fixes, I agree on all points. This has been a bit of a backburner project, and I didn't even notice the "Site" column in the final table! That can definitely be dropped. I will do another quick tidy, and comment again once done.

@Eric-git-999 (Contributor, Author) commented:

Ok, tidied up the unnecessary commented-out code in cells, updated the final table with "Overpass" as the index, and added optional code in the Combine Dataset section to add/alter time zones, with UTC to AEST as an example (+10 h). I think this is straightforward enough for an average person with some Python knowledge to use now :)

@robbibt (Collaborator) left a review:

Whoops, did I forget to accept this? I think this looks much nicer and easier to use/follow, thanks for putting in these changes!

When you're happy to merge, select the "squash and merge" option below :)

@robbibt merged commit f294ec0 into GeoscienceAustralia:develop on Feb 19, 2021
emmaai pushed a commit that referenced this pull request Feb 14, 2024
* Add files via upload

Supplementary data for ARD_overpass_predictor notebook in dea-notebooks/Frequently_used_code/

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload