## Reading data from URLs

### Using individual URLs

In [1]:
import earthkit.data as ekd

fs = ekd.from_source(
    "url",
    "https://get.ecmwf.int/repository/test-data/earthkit-data/test-data/t_pl.grib",
)

In [2]:
fs.ls()

| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | t | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 1 | ecmf | t | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 2 | ecmf | t | isobaricInhPa | 700 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 3 | ecmf | t | isobaricInhPa | 500 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 4 | ecmf | t | isobaricInhPa | 400 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 5 | ecmf | t | isobaricInhPa | 300 | 20180801 | 1200 | 0 | an | 0 | regular_ll |


Tar and zip archives can also be loaded from a URL:

In [3]:
fs = ekd.from_source(
    "url",
    "https://get.ecmwf.int/repository/test-data/earthkit-data/examples/test_gribs.tar",
)

In [4]:
fs.ls()

| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | 2t | surface | 0 | 20200513 | 1200 | 0 | an | 0 | regular_ll |
| 1 | ecmf | msl | surface | 0 | 20200513 | 1200 | 0 | an | 0 | regular_ll |
| 2 | ecmf | t | isobaricInhPa | 500 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 3 | ecmf | z | isobaricInhPa | 500 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 4 | ecmf | t | isobaricInhPa | 850 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 5 | ecmf | z | isobaricInhPa | 850 | 20070101 | 1200 | 0 | an | 0 | regular_ll |


### Using multiple URLs

We can load a list of URLs in one go. In the example below the first file contains 2 fields and the second contains 4.

In [5]:
fs = ekd.from_source(
    "url",
    [
        "https://get.ecmwf.int/repository/test-data/earthkit-data/examples/test.grib",
        "https://get.ecmwf.int/repository/test-data/earthkit-data/examples/test4.grib",
    ],
)
fs.ls()

| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | 2t | surface | 0 | 20200513 | 1200 | 0 | an | 0 | regular_ll |
| 1 | ecmf | msl | surface | 0 | 20200513 | 1200 | 0 | an | 0 | regular_ll |
| 2 | ecmf | t | isobaricInhPa | 500 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 3 | ecmf | z | isobaricInhPa | 500 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 4 | ecmf | t | isobaricInhPa | 850 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 5 | ecmf | z | isobaricInhPa | 850 | 20070101 | 1200 | 0 | an | 0 | regular_ll |


### Using URL patterns

The "url-pattern" source builds the URLs from a pattern with named placeholders and a dictionary of values. In the example below "{id}" is replaced by 4 and 6 in turn, so two files are loaded:

In [6]:
fs = ekd.from_source(
    "url-pattern",
    "https://get.ecmwf.int/repository/test-data/earthkit-data/examples/test{id}.grib",
    {"id": [4, 6]},
)
fs.ls()

| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | t | isobaricInhPa | 500 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 1 | ecmf | z | isobaricInhPa | 500 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 2 | ecmf | t | isobaricInhPa | 850 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 3 | ecmf | z | isobaricInhPa | 850 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 4 | ecmf | t | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 5 | ecmf | u | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 6 | ecmf | v | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 7 | ecmf | t | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 8 | ecmf | u | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 9 | ecmf | v | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
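
To see which URLs a pattern like the one above stands for, the expansion can be sketched in plain Python without downloading anything. This is only an illustration, not earthkit-data's implementation: `expand_pattern` is a hypothetical helper, and taking every combination of the supplied values is an assumption about how multiple placeholders are combined.

```python
from itertools import product


def expand_pattern(pattern, values):
    """Hypothetical helper: fill each {name} placeholder with every
    combination of the supplied values (assumed behaviour)."""
    names = list(values)
    # Wrap scalar values in a list so every parameter can be iterated.
    choices = [v if isinstance(v, (list, tuple)) else [v] for v in values.values()]
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*choices)
    ]


urls = expand_pattern(
    "https://get.ecmwf.int/repository/test-data/earthkit-data/examples/test{id}.grib",
    {"id": [4, 6]},
)
for url in urls:
    print(url)
```

With a single placeholder and two values this yields two URLs, one ending in test4.grib and one in test6.grib.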


Each placeholder in the pattern can carry a format specifier. In this example "my_date" is the placeholder name and ":date(%Y-%m-%d)" specifies how the date is rendered in the URL:

In [7]:
import datetime

fs = ekd.from_source(
    "url-pattern",
    "https://get.ecmwf.int/repository/test-data/earthkit-data/test-data/test_{my_date:date(%Y-%m-%d)}_{name}.grib",
    {"my_date": datetime.datetime(2020, 5, 13), "name": ["t2", "msl"]},
)
fs.ls()


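| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | 2t | surface | 0 | 20200513 | 1200 | 0 | an | 0 | regular_ll |
| 1 | ecmf | msl | surface | 0 | 20200513 | 1200 | 0 | an | 0 | regular_ll |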
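
The effect of the date format can be previewed with the standard library alone. The sketch below assumes that "%Y-%m-%d" is interpreted with the usual `strftime` codes and that the single date is paired with each name in turn; it only builds the URL strings and does not call earthkit-data. The variable names `base`, `date`, and `names` are illustrative.

```python
import datetime
from itertools import product

base = "https://get.ecmwf.int/repository/test-data/earthkit-data/test-data/"
date = datetime.datetime(2020, 5, 13)
names = ["t2", "msl"]

# Render the date with the strftime codes from the pattern, then pair it
# with every name (assumed to mirror how the placeholders are combined).
urls = [
    f"{base}test_{d.strftime('%Y-%m-%d')}_{n}.grib"
    for d, n in product([date], names)
]
for url in urls:
    print(url)
```

Here the two resulting URLs end in test_2020-05-13_t2.grib and test_2020-05-13_msl.grib.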
