
## Downloading Multiple Files

### Often, looking at a few files is enough to spot a pattern in how the files, and hence their URLs, are named

### In this example we will make use of such a pattern to allow us to download multiple files

### The libraries we need

In [None]:
import requests
import pandas as pd

## The code below is similar to what we have seen before

### The file comes from the web page `https://www.getthedata.com/covid-19/utla-by-day`

### The link to the download file is shown as `cases_by_utla_2020-03-16.csv`
### This is in fact a relative address within the website.

## But the full URL of the actual file we would download is shown at the bottom left of the screen as we hover over the link. It is:

##  ![Actual Filename](./images/actual_filename.jpg)

## If we want to download the file by other means then this is the URL that we must use.
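
A minimal sketch of how a relative link is turned into a full URL, using `urljoin` from the standard library. The `href` value here is an assumption chosen to match the full URL shown above; the real value should be checked in the page source.

```python
from urllib.parse import urljoin

# The page that lists the download links (from the text above).
page_url = 'https://www.getthedata.com/covid-19/utla-by-day'

# ASSUMPTION: the href behind the link is a root-relative path like this one.
# Check the page source for the real value.
href = '/downloads/cases_by_utla_2020-03-16.csv'

# urljoin resolves the relative address against the page URL,
# much as the browser does when we hover over the link.
full_url = urljoin(page_url, href)
print(full_url)
```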

In [None]:

myurl = 'https://www.getthedata.com/downloads/cases_by_utla_2020-03-16.csv'

savefilename = '200316.csv'
r = requests.get(myurl)
print(r.status_code)
# The with block closes the file for us, even if the write fails
with open(savefilename, "wb") as file:
    file.write(r.content)
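
The cell above prints the status code but writes the file whatever the code is, so a failed request would still leave a file on disk (containing an error page rather than data). A minimal sketch of one way to guard against that follows. `save_response` is a hypothetical helper, not part of `requests`, and the stand-in object only mimics the two attributes of a response that the helper uses.

```python
from types import SimpleNamespace


def save_response(response, savefilename):
    """Write response.content to savefilename only if the request succeeded.

    Returns True if the file was written, False otherwise.
    """
    if response.status_code != 200:
        return False
    with open(savefilename, 'wb') as file:
        file.write(response.content)
    return True


# Stand-in for a failed request, so the sketch runs without network access.
missing = SimpleNamespace(status_code=404, content=b'Not Found')
print(save_response(missing, 'never_written.csv'))
```

In the notebook the call would look like `save_response(requests.get(myurl), savefilename)`.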

## To download multiple files, we can use the structure of the filenames and request the downloads in a loop
### The loop below shows the idea.
### As we want to save the files, we use a similar technique to name the output files.

In [None]:
stem = 'https://www.getthedata.com/downloads/cases_by_utla_2020-03-'
ftype = '.csv'

ym = 'd:/covid19'
date_stem = '2020-03-'

for i in range(16,32):
    print(f'{stem}{i}{ftype}')
    print(f'{ym}/{date_stem}{i}{ftype}')    
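
The save paths printed above assume that the folder in `ym` already exists; `open()` will fail if it does not. A small sketch of creating the folder first, using `pathlib`, follows. `ensure_folder` is a hypothetical helper name, and the demonstration uses a temporary directory so that it runs on any machine.

```python
import tempfile
from pathlib import Path


def ensure_folder(folder):
    """Create folder (and any missing parents) if it does not exist yet."""
    path = Path(folder)
    path.mkdir(parents=True, exist_ok=True)
    return path


# Demonstrated on a temporary directory so the sketch runs anywhere;
# in the notebook the call would be ensure_folder(ym).
demo_root = tempfile.mkdtemp()
target = ensure_folder(Path(demo_root) / 'covid19')
print(target.is_dir())
```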

## Then we can put it all together

In [None]:
stem = 'https://www.getthedata.com/downloads/cases_by_utla_2020-03-'
ftype = '.csv'

ym = 'd:/covid19'
date_stem = '2020-03-'

for i in range(16,32):
    # set up URL and filename
    myurl = stem + str(i) + ftype
    savefilename = ym + "/" + date_stem + str(i) + ftype
    
    # Make the request
    r = requests.get(myurl)
    print(r.status_code)
    
    # Write the output file
    with open(savefilename, "wb") as file:
        file.write(r.content)


## Exercise

### When will the above code run into problems?

### There are two problems, both related to how we iterate through the dates.

### We only downloaded the files up to the end of March. If we wanted to go into April, the `date_stem` would need changing, and the value of `str(i)` for the first nine days would not have the required leading zero.
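
A quick illustration of the leading-zero problem. The `02d` format specifier pads a number to two digits, which would fix the day part on its own, although the month in `date_stem` would still have to be changed by hand.

```python
# str() drops the leading zero that the file names need
for i in (9, 16):
    print(str(i), f'{i:02d}')

# Padded day numbers for the first three days of April
padded = [f'2020-04-{i:02d}' for i in range(1, 4)]
print(padded)
```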

### We can resolve both of these problems at the same time using the datetime methods in pandas.


In [None]:
import pandas as pd

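# Getting a list of dates

sdate = '2020-03-16'
edate = '2020-12-31'

# One entry per calendar day, from sdate to edate inclusive
date_list = pd.date_range(sdate, edate, freq='D')

print(date_list)

# Each entry is a Timestamp, so printing it includes a time component
print(f'{date_list[0]}')

# strftime gives the date in the format used by the file names
date_list[0].strftime('%Y-%m-%d')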
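
For comparison, the same list of date strings can be built with the standard library alone. This is only a sketch to show that `strftime` is doing the formatting work; `date_strings` is a hypothetical helper name, and the pandas version above is the one used in the rest of this notebook.

```python
from datetime import date, timedelta


def date_strings(start, end):
    """Return every date from start to end (inclusive) as 'YYYY-MM-DD' strings."""
    n_days = (end - start).days + 1
    return [(start + timedelta(days=n)).strftime('%Y-%m-%d') for n in range(n_days)]


dates = date_strings(date(2020, 3, 16), date(2020, 12, 31))

# Number of dates, the first date, the first date in April, and the last date
print(len(dates), dates[0], dates[16], dates[-1])
```

The month rolls over and the leading zeros appear without any special handling.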

## Things I need to change

1. The `stem`
2. Create list of dates
3. Iterate through dates in `for` loop
4. Change `myurl` to include the string format of the date
5. Change `savefilename` to include the string format of the date

In [None]:
stem = 'https://www.getthedata.com/downloads/cases_by_utla_'
ftype = '.csv'

ym = 'd:/covid19'

# Getting a list of dates

sdate = '2020-03-16'
edate = '2020-12-31'

date_list = pd.date_range(sdate, edate, freq='D')

for date in date_list:
    # set up URL and filename
    myurl = stem + date.strftime('%Y-%m-%d') + ftype
    savefilename = ym + "/" + date.strftime('%Y-%m-%d') + ftype
    
    # Make the request
    r = requests.get(myurl)
    print(r.status_code)
    
    # Write the output file
    with open(savefilename, "wb") as file:
        file.write(r.content)

## Once downloaded, we can use Excel (Data | Import from other sources | from folder) to combine the files into one.
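
As an alternative to Excel, the files could also be combined in Python. The sketch below uses only the standard `csv` module and assumes that every downloaded file has the same header row (the column layout of the files is not shown here). `combine_csv_files` is a hypothetical helper name, and the added `source_file` column is an illustrative choice to record which file each row came from.

```python
import csv
from pathlib import Path


def combine_csv_files(folder, output_file):
    """Stack every .csv file in folder into output_file, keeping one header row.

    Returns the number of data rows written.
    """
    output_path = Path(output_file).resolve()
    # List the input files before the output file is created, and never
    # read the output file itself if it already sits in the same folder.
    files = sorted(p for p in Path(folder).glob('*.csv')
                   if p.resolve() != output_path)

    header_written = False
    rows_written = 0
    with open(output_path, 'w', newline='') as out:
        writer = csv.writer(out)
        for path in files:
            with open(path, newline='') as src:
                reader = csv.reader(src)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if not header_written:
                    writer.writerow(['source_file'] + header)
                    header_written = True
                for row in reader:
                    writer.writerow([path.name] + row)
                    rows_written += 1
    return rows_written
```

In the notebook the call might look like `combine_csv_files(ym, ym + '/combined.csv')`.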