Developing Guidance & Documentation for clim-recal #42

dingaaling · 2023-08-03T15:29:26Z

https://squidfunk.github.io/mkdocs-material/
@RuthBowyer could you please add some links to the Markdown files here?
Guidance: what you should do
Documentation: what you did

Plan:

- create vision for clean pipeline (see commit 89b70a5)
- added Sphinx base for future documentation to branch extend_documentation
- split existing content into main pipeline walk-through (visible in README) and internal document
- create "narrative walk-through" for pipeline

Still to do:

- change all paths to be either Azure-specific or dummy general-purpose
- change all steps to use just one metric, city and run as example
- add Griff's Azure doc to INTERNAL.md
- add contributors
- fix section links


aranas · 2023-08-10T11:39:02Z

Before we go into creating a nice-looking website using mkdocs, I would like to start by mapping out the individual elements of guidance & documentation more closely by reworking the README. I suggest that the README contains some guidance (with links to further info where it gets too deep), and that this guidance is clearly split by user group, e.g. non-climate scientists and more expert researchers, because they will have different goals when interacting with the project. Here is a draft; maybe we can discuss it together later today.

Structure

Intro (what is it, for whom)
ToC
Quick start guide (download repo, install dependencies & run small-scale example)
small-scale example as notebook
Guidance
- For non-climate scientists (why BC, which BCs brief taxonomy viz, how to decide/flowchart)
- For expert researchers (detailed technical guides, eg code examples & BC tutorials; how to contribute)
Documentation (divided into python & R pipelines?)
- Installation & setup
- Where to get data from & data format (eventually MO open data portal)
- functions docstrings
- FAQs
Research (review, references)
License & contributors

Resources for good README

Turing Way best practices on Landing Page
example projects with guidance and documentation in readme

aranas · 2023-08-10T12:32:32Z

The part about downloading data from Azure is only for internal info, right? Or will this be outward-facing info?
https://github.com/alan-turing-institute/clim-recal#accessing-the-pre-downloadedpre-processed-data

gmingas · 2023-08-10T12:46:47Z

The part about downloading data from Azure is only for internal info, right? Or will this be outward-facing info? https://github.com/alan-turing-institute/clim-recal#accessing-the-pre-downloadedpre-processed-data

Yes, this is for internal info and could be moved to another readme file or somewhere else, possibly with a link to it from the main readme.

RuthBowyer · 2023-08-10T14:12:37Z

I think this could be useful for our partners if this is how we share the data with them (which I think is still tbc?). Just in case you were unaware, see issues 37 and 38.

dingaaling · 2023-08-13T09:53:13Z

The overview you've drafted looks great, @aranas!

Two additional example resources I'd add as README reference points for us are BIG-bench and EleutherAI's lm-evaluation-harness. These are two examples of creating standardised resources for evaluating and comparing LLMs on a range of tasks. I think BIG-bench is better documented atm and probably the better source of inspo for us, but the eval-harness is also going through a major refactor atm.

Beyond the README, another useful reference point is how they document tasks in a summary table for Big-Bench and task-table for the eval-harness. I recommend we add that as a priority so we (and any users!) have a standard map/naming we can use to refer to our different BC methods.
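A summary table of this kind could be generated from a small list of records rather than maintained by hand. The sketch below is only an illustration of that idea: the method names, languages and descriptions are placeholders, not the actual BC methods in clim-recal, and the column set is an assumption.

```python
# Minimal sketch: render a Markdown summary table from a list of records.
# All entries below are hypothetical placeholders, not real clim-recal methods.

METHODS = [
    {"name": "method_a", "language": "python", "description": "placeholder description A"},
    {"name": "method_b", "language": "R", "description": "placeholder description B"},
]

COLUMNS = ["name", "language", "description"]


def markdown_table(rows, columns):
    """Return a Markdown table with one line per row, in the given column order."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = ["| " + " | ".join(str(row[col]) for col in columns) + " |" for row in rows]
    return "\n".join([header, divider, *body])


if __name__ == "__main__":
    print(markdown_table(METHODS, COLUMNS))
```

Keeping the records in one place would give the README and any later docs site the same standard names for each method.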

aranas · 2023-09-12T08:26:05Z

another example of a benchmark style repo but closer to home: https://github.com/duncanwp/ClimateBench

dingaaling · 2023-09-21T15:45:35Z

@gmingas prioritisation feedback: Guidance (e.g. step by step) of how to use the pipeline via CLI or notebook

@RuthBowyer feedback: we're still trying to figure out who the users are

@aranas "narrative around the pipeline"

aranas · 2023-10-11T14:43:51Z

@RuthBowyer atm, according to the analysis flowchart, the Cropping_Rasters_to_three_cities.R script takes the resampled files and extracts data for three cities before passing them on to further preprocessing (splitting into test & validate and eventually applying bias correction).
For me to include this in the pipeline walk-through, could you provide the relevant R specs, e.g. version, environment files specifying packages?

aranas · 2023-10-11T14:47:01Z

@RuthBowyer, I am wondering should we host the shapefiles on the GitHub repo to make this more accessible? they don't seem very big.

Else, I will need the source for this shapefile:
NUTS_Level_1_January_2018_FCB_in_the_United_Kingdom_2022_7279368953270783580 -- this shapefile used for defining regions and cutting, also London -- this one also used for chopping up LCAT data

aranas · 2023-10-11T16:18:31Z

For the analysis walk-through I will provide shell commands to execute the full pipeline end to end for one combination. I think it would suffice to illustrate this with one metric, one city, one run, rather than including the loops.
Would you agree or should the walk-through include the loops?
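To make the trade-off concrete, here is a minimal sketch of the difference between the looped script and a single worked example. The metric, city and run values are hypothetical placeholders, and the helper names are invented for illustration; none of them are taken from the clim-recal codebase.

```python
from itertools import product

# Hypothetical placeholder values; the real metric, city and run names
# are defined by the project, not by this sketch.
METRICS = ["metric_a", "metric_b"]
CITIES = ["city_a", "city_b", "city_c"]
RUNS = ["run_01", "run_02"]


def all_combinations(metrics, cities, runs):
    """Every (metric, city, run) triple that a looped script would visit."""
    return list(product(metrics, cities, runs))


def walkthrough_combination(metrics, cities, runs):
    """The single triple used as the worked example: the first of each list."""
    return (metrics[0], cities[0], runs[0])


if __name__ == "__main__":
    combos = all_combinations(METRICS, CITIES, RUNS)
    print(len(combos), "combinations in the full loop")
    print("walk-through uses:", walkthrough_combination(METRICS, CITIES, RUNS))
```

The walk-through would then show each pipeline step once for that single triple, while the looped script remains available for running everything.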

gmingas · 2023-10-11T17:40:41Z

For the analysis walk-through I will provide shell commands to execute the full pipeline end to end for one combination. I think it would suffice to illustrate this with one metric, one city, one run, rather than including the loops. Would you agree or should the walk-through include the loops?

Totally agree, just one combination is enough. And the script with the loops will be available in the codebase too.

gmingas · 2023-10-11T17:41:21Z

@RuthBowyer, I am wondering should we host the shapefiles on the GitHub repo to make this more accessible? they don't seem very big.

Else, I will need the source for this shapefile: NUTS_Level_1_January_2018_FCB_in_the_United_Kingdom_2022_7279368953270783580 -- this shapefile used for defining regions and cutting, also London -- this one also used for chopping up LCAT data

I would support including them in the repo if there are no licensing issues.

RuthBowyer · 2023-10-12T08:51:54Z

Yep sounds good - all downloaded from OA sources but might need to double check the licenses on the sites

griff-rees · 2023-10-12T15:25:43Z

I've added ticket #66 for configuring how the documentation is rendered and maintained.

griff-rees · 2023-10-17T18:29:25Z

I've added some screenshots from using quarto in #42. Great to get a sense if people like that option (and sorry I think I commented on #56 thinking it was this one, my bad).

aranas · 2023-10-18T19:31:57Z

Yep sounds good - all downloaded from OA sources but might need to double check the licenses on the sites

I will create a new issue for this @RuthBowyer

aranas · 2023-10-18T19:45:54Z

I think I have now written all the sections that I wanted to complete, so I will open PR #62 for review. Please feel free to comment / fix / close while I am on A/L this week.

Some open questions / comments from my side:

- R version specs need to be added to requirements.
- (line 123 in guidance) The R script logic at this point is currently quite different from the Python scripts, as all paths and variables are hardcoded in the script. Either I adjust the guidance (by instructing readers to change the paths in the R script, or by pasting R code with dummy paths directly into the guidance), or we adjust the Cropping R script.
- For simplicity in the guidance, I chose to remove some flags that had defaults and were not conceptually crucial for the analysis (e.g. the multiprocessing option), but feel free to add those back in if you think they are needed.
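For reference, the pattern of passing paths as arguments (with defaults for non-essential options) rather than hardcoding them can be sketched as below. This is an illustration using Python's standard `argparse` only; the flag names, defaults and dummy paths are assumptions made for the example and do not describe the actual interface of any clim-recal script.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser for a hypothetical cropping step (illustrative flags only)."""
    parser = argparse.ArgumentParser(
        description="Illustrative cropping step with paths passed as arguments."
    )
    # Required inputs: no machine-specific path is baked into the script.
    parser.add_argument("--input-dir", type=Path, required=True,
                        help="Directory holding the resampled files.")
    parser.add_argument("--shapefile", type=Path, required=True,
                        help="Shapefile used to define the region to crop.")
    # Optional settings with defaults, so guidance can omit them.
    parser.add_argument("--output-dir", type=Path, default=Path("./output"),
                        help="Where cropped files are written.")
    parser.add_argument("--city", default="city_a",
                        help="Placeholder city name for the example.")
    return parser


def parse_args(argv=None):
    """Parse an explicit argument list, or the command line when argv is None."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    # Dummy general-purpose paths, as a walk-through might show them.
    example = parse_args(["--input-dir", "/dummy/resampled",
                          "--shapefile", "/dummy/regions.shp"])
    print(example.input_dir, example.shapefile, example.output_dir, example.city)
```

With this shape, the guidance can show one command with dummy paths, and options that have sensible defaults can be left out without changing the script.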

gmingas · 2023-10-26T15:19:20Z

As discussed today: @RuthBowyer, when you are back, could you please have a look at this and add the documentation parts relevant to the R pipeline, e.g. the R packages list? The recommendation is to do this in a PR separate from #62.

gmingas · 2023-10-26T15:27:07Z

@aranas to talk to @griff-rees to decide on easiest approach for merging the quarto and guidance branches (quarto branch already has merged main branch, which was challenging)

griff-rees · 2023-10-30T20:45:56Z

@gmingas the merge was managed in #72

dingaaling assigned aranas Aug 3, 2023

dingaaling mentioned this issue Aug 15, 2023

Set up Read the Docs/Hugo site for RAM resources alan-turing-institute/research-application-management#48

Open

1 task

aranas added this to the 31 Aug - 14 Sept Sprint milestone Aug 31, 2023

aranas modified the milestones: 31 Aug - 14 Sept Sprint, 14 Sept - 30 Sept Sep 21, 2023

gmingas modified the milestones: 14 Sept - 30 Sept, 30 Sept - 12 Oct Sep 28, 2023

griff-rees mentioned this issue Oct 12, 2023

Documentation configuration #66

Closed

4 tasks

aranas mentioned this issue Oct 18, 2023

adding shapefiles to repo #68

Closed

4 tasks

griff-rees modified the milestones: 30 Sept - 12 Oct, 12 Oct - 2 Nov Oct 19, 2023

gmingas modified the milestones: 12 Oct - 26 Oct, 27 Oct - 9 Nov Oct 26, 2023

gmingas assigned RuthBowyer Oct 26, 2023

gmingas closed this as completed Nov 2, 2023
