# Downloading a file with Python

Your code may need to download a file as part of its workflow. If the file is already on disk when the function is called, you will probably want to skip the download, especially when the file is large.
To do so, we will use ```urlretrieve``` from the ```urllib.request``` module.

In [6]:
from urllib.request import urlretrieve

def download_file(file_link):
    """
    Downloads a file from URL "file_link"
    """
    
    # Fetch the file at file_link and save it in the current directory as 'file.csv'
    urlretrieve(file_link, filename='file.csv')

In [8]:
target_url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00320/student.zip"

download_file(target_url)

If you look in your working directory, you will see that the file was saved under a new name, ```file.csv```, even though it is a .zip file. You will usually want to control the name it is saved under, so let's write a better version of the function:

In [11]:
from urllib.request import urlretrieve

def download_file(file_link: str, output_file: str='file.csv'):
    """
    Downloads a file from a URL to your hard drive.
    
    Parameters
    ------------
    file_link: str
        A string containing the link to the file you wish to download.
    output_file: str
        A string containing the name of the output file. The default value is 'file.csv',
        saved in the directory you are running the function from.
        
    Returns
    ---------
    None
    
    
    Example
    ---------
    download_file("https://archive.ics.uci.edu/ml/machine-learning-databases/00320/student.zip", output_file='student.zip')
    """
    
    # Download the file at file_link and save it under the name given in output_file
    urlretrieve(file_link, filename=output_file)

In [48]:
# Just copy and paste the example from the docstring:
download_file("https://archive.ics.uci.edu/ml/machine-learning-databases/00320/student.zip", output_file='student.zip')
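
Typing the output name by hand works, but the name can also be derived from the link itself. The following is a minimal sketch, not part of the original function: `filename_from_url` is a hypothetical helper name, and it assumes the last segment of the URL path is a sensible file name.

```python
import posixpath
from urllib.parse import unquote, urlparse


def filename_from_url(file_link: str, default: str = 'file.csv') -> str:
    """Hypothetical helper: return the last path segment of a URL, or `default` if there is none."""
    # Keep only the path part of the URL, dropping any query string or fragment
    path = urlparse(file_link).path
    # URL paths always use forward slashes, so posixpath is used on every operating system
    name = posixpath.basename(unquote(path))
    # A URL ending in "/" has no file name, so fall back to the default
    return name or default


url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00320/student.zip"
print(filename_from_url(url))  # student.zip
```

The result could then be passed along, for example `download_file(url, output_file=filename_from_url(url))`.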

So far, so good. But the code still needs one more feature:
* Do not download the file if it already exists

Let's handle it.

In [47]:
from urllib.request import urlretrieve
import os  # lets Python check which files already exist on the hard drive


def download_file(file_link: str, output_file: str='file.csv'):
    """
    Downloads a file from a URL to your hard drive.
    
    Parameters
    ------------
    file_link: str
        A string containing the link to the file you wish to download.
    output_file: str
        A string containing the name of the output file. The default value is 'file.csv',
        saved in the directory you are running the function from.
        
    Returns
    ---------
    None
    
    
    Example
    ---------
    download_file("https://archive.ics.uci.edu/ml/machine-learning-databases/00320/student.zip", output_file='student.zip')
    """
    
    # If the file doesn't exist, download it. Otherwise, print a message and skip the download.
    if not os.path.exists(output_file):
        urlretrieve(file_link, filename=output_file)
    else:
        print("File already exists!")

In [46]:
download_file("https://archive.ics.uci.edu/ml/machine-learning-databases/00320/student.zip", output_file='student.zip')

File already exists!


Notice that the check looks only at the file name: if you pass a different ```output_file``` value, the function will download the file again.
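
This behaviour can be tried out without touching the network. The sketch below is an illustration under a few assumptions: a local `file://` link stands in for the remote URL, the file names (`source.txt`, `copy_a.txt`, `copy_b.txt`) are made up, and the function is a variant that returns `True` or `False` instead of printing, so the outcome of each call is easy to inspect.

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlretrieve


def download_file(file_link: str, output_file: str = 'file.csv') -> bool:
    """Variant for illustration: returns True only when a download actually happened."""
    if os.path.exists(output_file):
        return False
    urlretrieve(file_link, filename=output_file)
    return True


with tempfile.TemporaryDirectory() as tmp:
    # Create a small local file and use its file:// link in place of a remote URL
    source = Path(tmp) / "source.txt"
    source.write_text("hello")
    link = source.as_uri()

    first = download_file(link, os.path.join(tmp, "copy_a.txt"))   # new name: downloads
    second = download_file(link, os.path.join(tmp, "copy_a.txt"))  # same name: skipped
    third = download_file(link, os.path.join(tmp, "copy_b.txt"))   # different name: downloads again

print(first, second, third)  # True False True
```

The same link is fetched twice here because the function only knows about output file names, not about which links it has already retrieved.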

Our function is now ready to be placed in a .py file and used in other code we develop.

The code prototyped here is now in filedownload.py.