Python SubX download_data produces an error before scripts can be created #18

Open
kdl0013 opened this issue Dec 6, 2021 · 3 comments


kdl0013 commented Dec 6, 2021

In SubX/Python/download_data/generate_full_py_ens_files.ksh, the code produces a list of all files, but it fails immediately at fen='python tmp.py' with the following error:

RuntimeError: NetCDF: Access failure
oc_open: server error retrieving url: code=6 message="request too large"

The provided link, https://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/.RSMAS/.CCSM4/.hindcast/.tas/dods, does not appear to list any files to open, which may be the problem. IRIDL may have changed the site so that DODS information is no longer exposed.
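
One possible reading of the "request too large" message is that the server rejects a request covering the whole hindcast at once. Below is a minimal sketch that builds a URL limited to a single start date. It assumes IRIDL accepts a selection of the form S/(1200 D Mon YYYY)/VALUES ahead of the dods suffix (the same selection syntax appears in the data.nc URLs later in this thread). The helper name subx_dods_url is hypothetical, and the resulting URL has not been verified against the server.

```python
import datetime as dt
from urllib.parse import quote

BASE = "https://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX"


def subx_dods_url(model, source, var, start):
    """Build a URL restricted to one start date (hypothetical helper)."""
    # Assumed selection syntax, copied from the data.nc URLs in this thread
    selection = f"(1200 {start.day} {start.strftime('%b')} {start.year})"
    # Percent-encode the spaces but keep the parentheses readable
    return (f"{BASE}/.{model}/.{source}/.hindcast/.{var}"
            f"/S/{quote(selection, safe='()')}/VALUES/dods")


url = subx_dods_url("RSMAS", "CCSM4", "tas", dt.date(2000, 1, 5))
print(url)
```

If the server accepts this form, the URL could be passed to xarray.open_dataset one start date at a time, which keeps each request small.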

@raghu330

I am facing the same issue: oc_open: server error retrieving url: code=6 message="request too large". I added decode_times=False, but the error persists.


kdl0013 commented Feb 16, 2022

> I too face the same issue "oc_open: server error retrieving url: code=6 message="request too large"". I added 'decode_times=False', but the error still persists.

@raghu330 You can use this Python script I wrote to download SubX data. It generates a shell script of wget commands, which you then run to download the files.

#!/usr/bin/env python3

'''Create a wget script for each SubX model to set up parallel file downloads.

Inputs: IRIDL NCAR username and password
Inputs -- you can change the models and variables that you want downloaded

Outputs: Shell script containing the download commands. Files download 20 at a time when run.

'''
import os
import datetime as dt
import numpy as np


username_IRIDL = "usrname"
password_IRIDL = "psswd"

models = ['GMAO']
sources = ['GEOS_V2p1']
#vars = ['huss', 'dswrf','mrso','tas', 'uas', 'vas','tdps','pr','cape']
vars = ['tasmax','tasmin']

# Get the dates for all SubX models (these are all the possible dates).
# I started at year 2000 because the other SubX models have all started by then.
start_date = dt.date(2000, 1, 5)
end_date = dt.date(2015, 12, 26)

n_days = (end_date - start_date).days + 1
dates = [start_date + dt.timedelta(days=d) for d in range(n_days)]
# GMAO GEOS specifically skips leap days
dates_out = [d for d in dates if not (d.month == 2 and d.day == 29)]

# Only every 5th date
dates = dates_out[::5]

### GMAO GEOS_V2p1 model

new_dir = f'/glade/scratch/klesinger/SubX/{models[0]}'
# Like mkdir -p: creates parent directories and does not fail if the directory exists
os.makedirs(new_dir, exist_ok=True)

count=0    
output=[]
output.append('#!/bin/bash')
for m_i, model in enumerate(models):
    
    for d_i, date in enumerate(dates):
        
        for v_i, var in enumerate(vars):
        
            date_str = date.strftime('%Y-%m-%d')

            # Request one start date (S) per file; -O names the output file
            # (without it wget would treat the path as a second URL)
            url = (f"http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/.{model}/.{sources[m_i]}/.hindcast/.{var}"
                   f"/S/(1200%20{date.day}%20{date.strftime('%b')}%20{date.year})/VALUES/data.nc")
            command = f"wget -nc --user {username_IRIDL} --password {password_IRIDL} -O {new_dir}/{var}_{model}_{date_str}.nc '{url}' &"

            # Pause after every 20 background downloads
            if count and count % 20 == 0:
                output.append('wait')

            count += 1
            output.append(command)
            
output.append('wait')  # let the final batch finish before the script exits

# Write the shell script directly, one command per line
with open('wget_GMAO.sh', 'w') as f:
    f.write('\n'.join(output) + '\n')

After running the script, run bash wget_GMAO.sh from the command line.
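
The "20 at a time" batching can be looked at in isolation. The sketch below is not part of the script above; the function name batch_with_wait and the echo commands are placeholders for illustration. It backgrounds each command with & and inserts a wait after every full batch, plus one at the end so that a partial final batch also finishes before the script exits.

```python
def batch_with_wait(commands, batch_size=20):
    """Return shell script lines with a `wait` after every batch_size background jobs."""
    lines = ["#!/bin/bash"]
    for i, cmd in enumerate(commands, start=1):
        lines.append(f"{cmd} &")      # run the command in the background
        if i % batch_size == 0:
            lines.append("wait")      # block until the current batch is done
    if lines[-1] != "wait":
        lines.append("wait")          # let the final partial batch finish
    return lines


# Placeholder commands standing in for the wget calls
demo = batch_with_wait([f"echo job{i}" for i in range(45)])
print("\n".join(demo[:3]))
```

Placing wait after a full batch, rather than before the next one, keeps the generated script free of a leading wait that has nothing to wait for.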

@raghu330

Hello @kdl0013, thank you very much for the response and for an excellent wget solution. My apologies for the delay in responding. I was able to help the person who brought this issue to me using a similar but more convoluted approach; yours is better optimized. I have not heard from him since, so I have returned to my official duties. Thank you very much!
Cheers
Raghu
