This repository has been archived by the owner on Jan 26, 2021. It is now read-only.

Tutorial Data #54

Closed
5 tasks done
kmpaul opened this issue Sep 9, 2019 · 16 comments

Comments

@kmpaul
Contributor

kmpaul commented Sep 9, 2019

  • Need to identify the data needed for this tutorial.
  • Find a public place to host the data so everyone can download it.
  • Need to provide a script to download the data to other machines.
  • Enable the configure script to automatically download data (if not on Cheyenne or Casper)
  • If on Cheyenne or Casper, provide soft-links to data
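
The last two tasks amount to a single decision: link to a local copy of the data when one exists, otherwise download it. A minimal sketch of that decision follows. The function name `choose_data_strategy` and the idea of probing a list of candidate directories are assumptions made for illustration, not the configure script's actual logic.

```python
import os


def choose_data_strategy(candidate_dirs):
    """Pick how to obtain the tutorial data.

    Returns ("symlink", path) for the first candidate directory that
    exists on this machine, or ("download", None) when none exists.
    """
    for path in candidate_dirs:
        if os.path.isdir(path):
            return ("symlink", path)
    return ("download", None)


if __name__ == "__main__":
    # Hypothetical location of a shared copy; adjust for the real system.
    candidates = ["/path/to/shared/tutorial-data"]
    print(choose_data_strategy(candidates))
```

Probing for a directory rather than matching hostnames keeps the check working on any machine that mounts the shared filesystem, at the cost of needing an agreed location for the data.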
@jukent
Contributor

jukent commented Sep 9, 2019

This is the parent directory of the data -- /glade/collections/cmip/CMIP6/CMIP/NASA-GISS/GISS-E2-1-G/historical/r1i1p1f1/Omon/thetao/gn/v20180827

I will take some time this afternoon to dig through the subdirectories to the data used.

Full path:

/glade/collections/cmip/CMIP6/CMIP/NASA-GISS/GISS-E2-1-G/historical/r1i1p1f1/Omon/thetao/gn/v20180827/thetao/thetao_Omon_GISS-E2-1-G_historical_r1i1p1f1_gn_185001-187012.nc

@andersy005
Contributor

@jukent, do we need all 20 years' worth of data, or can we get a subset from this file?

@andersy005
Contributor

andersy005 commented Sep 10, 2019

FYI:

figshare offers 20 GB of free storage to account owners. I am thinking of hosting sample data (for users without an account on Cheyenne) under my figshare account, if there are no objections.

@jukent
Contributor

jukent commented Sep 10, 2019

We can subset it, but we're nowhere near that 20 GB limit, so is it something we should do anyway?

@kmpaul
Contributor Author

kmpaul commented Sep 10, 2019

What download speeds can we expect from figshare?

@matt-long
Contributor

Why not use the CGD ftp location we had set up for exactly this thing?

@kmpaul
Contributor Author

kmpaul commented Sep 10, 2019

Does the CGD ftp server allow anonymous download? Who has permissions to upload?

@andersy005
Contributor

> Why not use the CGD ftp location we had set up for exactly this thing?

@matt-long, I've not looked into this yet. I will give it a try before exploring the figshare option.

> Does the CGD ftp server allow anonymous download? Who has permissions to upload?

@kmpaul, this needs further exploration.

I am going to look into the CGD ftp server option this afternoon, and I will let you know how it goes.

@andersy005
Contributor

@kmpaul

> Does the CGD ftp server allow anonymous download?

Yes. As a smoke test you can try downloading this script:

wget ftp://ftp.cgd.ucar.edu/archive/aletheia-data/test.sh
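
The same anonymous download can be done from Python's standard library, which is roughly what a download script would need. This is a sketch under assumptions: the `tutorial-data` subdirectory in `BASE_URL` is a guess at where the files might sit on the server, and `data_url` and `fetch` are hypothetical helper names.

```python
import shutil
import urllib.request
from pathlib import Path

# Assumed base location on the anonymous FTP server; adjust if the
# files are published somewhere else.
BASE_URL = "ftp://ftp.cgd.ucar.edu/archive/aletheia-data/tutorial-data"


def data_url(filename, base_url=BASE_URL):
    """Build the full URL for one data file."""
    return f"{base_url.rstrip('/')}/{filename}"


def fetch(filename, dest_dir, base_url=BASE_URL):
    """Download one file into dest_dir and return the local path."""
    dest = Path(dest_dir) / filename
    dest.parent.mkdir(parents=True, exist_ok=True)
    # urlopen handles ftp:// URLs with an anonymous login by default.
    with urllib.request.urlopen(data_url(filename, base_url)) as response:
        with open(dest, "wb") as out:
            shutil.copyfileobj(response, out)
    return dest


if __name__ == "__main__":
    print(fetch("co2.nc", "data"))
```

Streaming with `shutil.copyfileobj` avoids holding a large file in memory.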

> Who has permissions to upload?

As far as I know, Matt and I are among the users with upload permissions.

@andersy005
Contributor

@matt-long,

I just figured the FTP details out, and I am in favor of using the ftp server for sample data instead of figshare.

@matt-long
Contributor

matt-long commented Sep 10, 2019

Good. Regarding @kmpaul's question,

> Who has permissions to upload?

Ultimately, I think we want to set up some sort of version control and have a means of getting input from other groups, but include a gatekeeper. I don't know what the best solutions in this regard are.

@kmpaul
Contributor Author

kmpaul commented Sep 10, 2019

I think a small ftp service with versioned files might be something Jeff DLB’s division should provide. They do this on a regular basis for things like the RDA.

@jukent
Contributor

jukent commented Sep 10, 2019

I limited the data to 5 years; it is now < 600 MB.
The path on Cheyenne is:
/glade/u/home/jkent/tutorial_data/thetao_Omon_historical_GISS-E2-1-G_r1i1p1f1_gn_185001-185512.nc

@andersy005
Contributor

This is fixed in #61

*** Found an existing Conda installation in: /opt/conda/bin/conda
*** (23:16:18) Step 1/6: Skipping Conda installation
*** (23:16:18) Step 2/6: Updating `base` environment
*** (23:17:48) Step 3/6: Checking for environment name conflict
*** (23:17:48) Step 4/6: Creating `python-tutorial` environment (this can take several minutes)
*** (23:21:00) Step 5/6: Running post build script for `base` environment
*** (23:22:29) Step 6/6: Running post build script for `python-tutorial` environment (this can take several minutes)
*** (23:25:45) Setup completed successfully.
==> For changes to take effect, close and re-open your current shell. <==
Currently downloading tutorial data

Progress: |--------------------------------------------------| 0.0% /root/project/data/NOAA_NCDC_ERSST_v3b_SST.nc
Progress: |██████--------------------------------------------| 12.5% /root/project/data/air_temperature.nc
Progress: |████████████--------------------------------------| 25.0% /root/project/data/co2.nc
Progress: |███████████████████-------------------------------| 37.5% /root/project/data/moc.nc
Progress: |█████████████████████████-------------------------| 50.0% /root/project/data/rasm.nc
Progress: |███████████████████████████████-------------------| 62.5% /root/project/data/sst_indices.csv
Progress: |██████████████████████████████████████------------| 75.0% /root/project/data/thetao_Omon_historical_GISS-E2-1-G_r1i1p1f1_gn_185001-185512.nc
Progress: |████████████████████████████████████████████------| 87.5% /root/project/data/woa2013v2-O2-thermocline-ann.nc
Progress: |██████████████████████████████████████████████████| 100.0%
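
For reference, a progress line in this style can be produced with a few lines of Python. This is an illustrative sketch only: the name `render_progress`, the 50-character width, and the rounding rule are assumptions, not the code used by the setup script.

```python
def render_progress(done, total, width=50, label=""):
    """Return one text progress line for `done` out of `total` items."""
    fraction = done / total if total else 1.0
    # Number of filled cells, rounded to the nearest integer.
    filled = round(width * fraction)
    bar = "█" * filled + "-" * (width - filled)
    return f"Progress: |{bar}| {100 * fraction:.1f}% {label}"


if __name__ == "__main__":
    files = ["air_temperature.nc", "co2.nc", "moc.nc", "rasm.nc"]
    for index, name in enumerate(files):
        print(render_progress(index, len(files), label=name))
    print(render_progress(len(files), len(files)))
```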

@andersy005
Contributor

andersy005 commented Sep 13, 2019

Cheyenne/Casper

$ ./setup/configure --download
Creating symlink to existing/local tutorial data directory: /glade/work/abanihi/aletheia-data/tutorial-data
ln -sf /glade/work/abanihi/aletheia-data/tutorial-data/* /glade/work/abanihi/devel/ncar/NCAR-pangeo-tutorial/data/

$ ls -lh data
total 8.0K
lrwxrwxrwx 1 abanihi ncar  66 Sep 12 18:25 air_temperature.nc -> /glade/work/abanihi/aletheia-data/tutorial-data/air_temperature.nc
lrwxrwxrwx 1 abanihi ncar  54 Sep 12 18:25 co2.nc -> /glade/work/abanihi/aletheia-data/tutorial-data/co2.nc
lrwxrwxrwx 1 abanihi ncar  54 Sep 12 18:25 moc.nc -> /glade/work/abanihi/aletheia-data/tutorial-data/moc.nc
lrwxrwxrwx 1 abanihi ncar  74 Sep 12 18:25 NOAA_NCDC_ERSST_v3b_SST.nc -> /glade/work/abanihi/aletheia-data/tutorial-data/NOAA_NCDC_ERSST_v3b_SST.nc
lrwxrwxrwx 1 abanihi ncar  55 Sep 12 18:25 rasm.nc -> /glade/work/abanihi/aletheia-data/tutorial-data/rasm.nc
-rw-r--r-- 1 abanihi ncar   0 Sep  5 12:11 README.md
lrwxrwxrwx 1 abanihi ncar  63 Sep 12 18:25 sst_indices.csv -> /glade/work/abanihi/aletheia-data/tutorial-data/sst_indices.csv
lrwxrwxrwx 1 abanihi ncar 111 Sep 12 18:25 thetao_Omon_historical_GISS-E2-1-G_r1i1p1f1_gn_185001-185512.nc -> /glade/work/abanihi/aletheia-data/tutorial-data/thetao_Omon_historical_GISS-E2-1-G_r1i1p1f1_gn_185001-185512.nc
lrwxrwxrwx 1 abanihi ncar  79 Sep 12 18:25 woa2013v2-O2-thermocline-ann.nc -> /glade/work/abanihi/aletheia-data/tutorial-data/woa2013v2-O2-thermocline-ann.nc
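
The `ln -sf <source>/* <dest>/` step shown above could be expressed in Python roughly as follows. This is a sketch, not the configure script's code: `link_all` is a hypothetical name, and it assumes the source directory holds plain files rather than subdirectories.

```python
from pathlib import Path


def link_all(src_dir, dest_dir):
    """Symlink every non-hidden entry of src_dir into dest_dir.

    An existing file or link of the same name in dest_dir is replaced,
    similar to `ln -sf src_dir/* dest_dir/`.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    linked = []
    for src in sorted(Path(src_dir).iterdir()):
        if src.name.startswith("."):
            continue  # the shell glob `*` skips dotfiles too
        target = dest / src.name
        if target.is_symlink() or target.exists():
            target.unlink()  # assumes target is a file or link, not a directory
        target.symlink_to(src.resolve())
        linked.append(target)
    return linked


if __name__ == "__main__":
    # Hypothetical paths; point these at the real shared data directory.
    for link in link_all("/path/to/shared/tutorial-data", "data"):
        print(link, "->", link.resolve())
```

Files that exist only in the destination, such as the `README.md` in the listing above, are left untouched.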

Personal Laptop

$ ./setup/configure --download
Currently downloading tutorial data
Progress: |--------------------------------------------------| 0.0% /Users/abanihi/devel/ncar/NCAR-pangeo-tutorial/data/NOAA_NCDC_ERSST_v3b_SST.nc  
Progress: |██████--------------------------------------------| 12.5% /Users/abanihi/devel/ncar/NCAR-pangeo-tutorial/data/air_temperature.nc  
Progress: |████████████--------------------------------------| 25.0% /Users/abanihi/devel/ncar/NCAR-pangeo-tutorial/data/co2.nc  
Progress: |███████████████████-------------------------------| 37.5% /Users/abanihi/devel/ncar/NCAR-pangeo-tutorial/data/moc.nc  
Progress: |█████████████████████████-------------------------| 50.0% /Users/abanihi/devel/ncar/NCAR-pangeo-tutorial/data/rasm.nc  
Progress: |███████████████████████████████-------------------| 62.5% /Users/abanihi/devel/ncar/NCAR-pangeo-tutorial/data/sst_indices.csv  
Progress: |██████████████████████████████████████------------| 75.0% /Users/abanihi/devel/ncar/NCAR-pangeo-tutorial/data/thetao_Omon_historical_GISS-E2-1-G_r1i1p1f1_gn_185001-185512.nc  
Progress: |████████████████████████████████████████████------| 87.5% /Users/abanihi/devel/ncar/NCAR-pangeo-tutorial/data/woa2013v2-O2-thermocline-ann.nc  
Progress: |██████████████████████████████████████████████████| 100.0% 
$ ls -lh data
total 1308688
-rw-r--r--  1 abanihi  CIT\Domain Users    42M Sep 12 18:27 NOAA_NCDC_ERSST_v3b_SST.nc
-rw-r--r--  1 abanihi  CIT\Domain Users     0B Aug 28 18:22 README.md
-rw-r--r--  1 abanihi  CIT\Domain Users   7.4M Sep 12 18:27 air_temperature.nc
-rw-r--r--  1 abanihi  CIT\Domain Users   487K Sep 12 18:27 co2.nc
-rw-r--r--  1 abanihi  CIT\Domain Users   108K Sep 12 18:27 moc.nc
-rw-r--r--  1 abanihi  CIT\Domain Users    16M Sep 12 18:27 rasm.nc
-rw-r--r--  1 abanihi  CIT\Domain Users    25K Sep 12 18:27 sst_indices.csv
-rw-r--r--  1 abanihi  CIT\Domain Users   570M Sep 12 18:27 thetao_Omon_historical_GISS-E2-1-G_r1i1p1f1_gn_185001-185512.nc
-rw-r--r--  1 abanihi  CIT\Domain Users   268K Sep 12 18:27 woa2013v2-O2-thermocline-ann.nc

Hobart

$ ./setup/configure --download
Creating symlink to existing/local tutorial data directory: /ftp/archive/aletheia-data/tutorial-data
ln -sf /ftp/archive/aletheia-data/tutorial-data/* /scratch/cluster/abanihi/NCAR-pangeo-tutorial/data/
$ ls -lh data
total 12K
lrwxrwxrwx 1 abanihi cgdoce  59 Sep 12 18:28 air_temperature.nc -> /ftp/archive/aletheia-data/tutorial-data/air_temperature.nc
lrwxrwxrwx 1 abanihi cgdoce  47 Sep 12 18:28 co2.nc -> /ftp/archive/aletheia-data/tutorial-data/co2.nc
lrwxrwxrwx 1 abanihi cgdoce  47 Sep 12 18:28 moc.nc -> /ftp/archive/aletheia-data/tutorial-data/moc.nc
lrwxrwxrwx 1 abanihi cgdoce  67 Sep 12 18:28 NOAA_NCDC_ERSST_v3b_SST.nc -> /ftp/archive/aletheia-data/tutorial-data/NOAA_NCDC_ERSST_v3b_SST.nc
lrwxrwxrwx 1 abanihi cgdoce  48 Sep 12 18:28 rasm.nc -> /ftp/archive/aletheia-data/tutorial-data/rasm.nc
-rw-r--r-- 1 abanihi cgdoce   0 Sep 12 16:31 README.md
lrwxrwxrwx 1 abanihi cgdoce  56 Sep 12 18:28 sst_indices.csv -> /ftp/archive/aletheia-data/tutorial-data/sst_indices.csv
lrwxrwxrwx 1 abanihi cgdoce 104 Sep 12 18:28 thetao_Omon_historical_GISS-E2-1-G_r1i1p1f1_gn_185001-185512.nc -> /ftp/archive/aletheia-data/tutorial-data/thetao_Omon_historical_GISS-E2-1-G_r1i1p1f1_gn_185001-185512.nc
lrwxrwxrwx 1 abanihi cgdoce  72 Sep 12 18:28 woa2013v2-O2-thermocline-ann.nc -> /ftp/archive/aletheia-data/tutorial-data/woa2013v2-O2-thermocline-ann.nc

@andersy005
Contributor

@kmpaul, @matt-long

Let me know whether this looks good and/or anything is missing.
