In [1]:
from IPython.display import Markdown

import os
import openai
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key = os.environ['OPENAI_API_KEY']

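def get_completion_from_messages(messages, model="gpt-3.5-turbo", temperature=0, max_tokens=1000):
    """Send a chat conversation to the model and return the text of its reply.

    Uses the legacy `openai.ChatCompletion` interface (openai package < 1.0).
    """
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature,  # 0 keeps the output as deterministic as possible
        max_tokens=max_tokens,    # upper bound on the length of the reply
    )
    return response.choices[0].message["content"]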

In [9]:
instructions = """
Generate a Python function that downloads a file from the internet and saves it in a local directory named `datasets`.
If the file already exists in the directory, it should not be downloaded again. Instead, the function should print the directory tree relative to the `datasets` folder.

If the file does not exist yet in the directory, download it, extract the zip file, and print the directory tree relative to the `datasets` folder.


Return your results in markdown.
"""

messages = [
    {
        "role": "user",
        "content": instructions
    },
]

results = get_completion_from_messages(messages=messages)
Markdown(results)

```python
import os
import urllib.request
import zipfile

def download_file(url, filename):
    # Check if file already exists in the directory
    if os.path.exists(f"datasets/{filename}"):
        # Print the directory tree relative to the datasets folder
        for root, dirs, files in os.walk("datasets"):
            level = root.replace("datasets", "").count(os.sep)
            indent = " " * 4 * (level)
            print(f"{indent}{os.path.basename(root)}/")
            subindent = " " * 4 * (level + 1)
            for f in files:
                print(f"{subindent}{f}")
    else:
        # Download the file from the internet
        urllib.request.urlretrieve(url, f"datasets/{filename}")
        print(f"File '{filename}' downloaded successfully.")
        
        # Extract the zip file
        with zipfile.ZipFile(f"datasets/{filename}", 'r') as zip_ref:
            zip_ref.extractall("datasets")
        
        # Print the directory tree relative to the datasets folder
        for root, dirs, files in os.walk("datasets"):
            level = root.replace("datasets", "").count(os.sep)
            indent = " " * 4 * (level)
            print(f"{indent}{os.path.basename(root)}/")
            subindent = " " * 4 * (level + 1)
            for f in files:
                print(f"{subindent}{f}")
```

To use this function, you can call it with the URL of the file you want to download and the desired filename:

```python
download_file("https://example.com/dataset.zip", "dataset.zip")
```

If the file already exists in the `datasets` directory, it will print the directory tree relative to the `datasets` folder. If the file does not exist yet, it will download the file, extract the zip file, and then print the directory tree.
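
The generated function assumes that the `datasets` directory already exists (`urllib.request.urlretrieve` does not create missing directories) and repeats the tree-printing loop in both branches. The following is a minimal sketch of one way to address both points, not the model's output. The names `print_tree` and `download_and_extract` and the `base_dir` parameter are assumptions introduced here for illustration.

```python
import os
import urllib.request
import zipfile


def print_tree(base_dir):
    """Print the contents of base_dir as an indented tree and return the lines."""
    lines = []
    for root, dirs, files in os.walk(base_dir):
        dirs.sort()  # visit subdirectories in a deterministic order
        rel = os.path.relpath(root, base_dir)
        # Depth is derived from the relative path, so a nested folder that
        # happens to contain the base directory's name is handled correctly.
        level = 0 if rel == "." else rel.count(os.sep) + 1
        name = os.path.basename(os.path.normpath(root))
        lines.append(f"{' ' * 4 * level}{name}/")
        for f in sorted(files):
            lines.append(f"{' ' * 4 * (level + 1)}{f}")
    print("\n".join(lines))
    return lines


def download_and_extract(url, filename, base_dir="datasets"):
    """Download and unzip `url` into base_dir unless the archive is already there."""
    os.makedirs(base_dir, exist_ok=True)  # create the target directory if missing
    archive_path = os.path.join(base_dir, filename)
    if not os.path.exists(archive_path):
        urllib.request.urlretrieve(url, archive_path)
        with zipfile.ZipFile(archive_path, "r") as zip_ref:
            zip_ref.extractall(base_dir)
    # The tree is printed in both cases, so the loop lives in one place.
    return print_tree(base_dir)
```

Calling `download_and_extract("https://example.com/dataset.zip", "dataset.zip")` a second time skips the download and only prints the tree, which matches the behavior requested in the prompt.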

In [10]:
instructions2 = """
Explain the code you wrote.
The explanation should be short and concise.
The tone should be professional, academic, technical, formal, but compelling.

Return your results in markdown.
"""

messages = [
    {
        "role": "user",
        "content": instructions
    },
    {
        "role": "user",
        "content": instructions2
    },
]

results = get_completion_from_messages(messages=messages)
Markdown(results)

Sure! Here is the explanation of the code:

```python
import os
import urllib.request
import zipfile

def download_and_extract_file(url, filename):
    # Check if the file already exists in the directory
    if os.path.exists('datasets/' + filename):
        # Print the directory tree relative to the datasets folder
        for root, dirs, files in os.walk('datasets'):
            level = root.replace('datasets', '').count(os.sep)
            indent = ' ' * 4 * (level)
            print('{}{}/'.format(indent, os.path.basename(root)))
            subindent = ' ' * 4 * (level + 1)
            for f in files:
                print('{}{}'.format(subindent, f))
        return
    
    # Download the file from the internet
    urllib.request.urlretrieve(url, 'datasets/' + filename)
    
    # Extract the zip file
    with zipfile.ZipFile('datasets/' + filename, 'r') as zip_ref:
        zip_ref.extractall('datasets')
    
    # Print the directory tree relative to the datasets folder
    for root, dirs, files in os.walk('datasets'):
        level = root.replace('datasets', '').count(os.sep)
        indent = ' ' * 4 * (level)
        print('{}{}/'.format(indent, os.path.basename(root)))
        subindent = ' ' * 4 * (level + 1)
        for f in files:
            print('{}{}'.format(subindent, f))
```

Explanation:

1. The code imports the necessary modules: `os` for file and directory operations, `urllib.request` for downloading files from the internet, and `zipfile` for extracting zip files.

2. The function `download_and_extract_file` takes two parameters: `url` (the URL of the file to be downloaded) and `filename` (the name of the file to be saved in the local directory).

3. It checks if the file already exists in the `datasets` directory using the `os.path.exists` function. If it exists, it prints the directory tree relative to the `datasets` folder using the `os.walk` function.

4. If the file does not exist, it downloads the file from the internet using `urllib.request.urlretrieve` and saves it in the `datasets` directory.

5. It then extracts the zip file using the `zipfile.ZipFile` context manager and the `extractall` method, saving the extracted files in the `datasets` directory.

6. Finally, it prints the directory tree relative to the `datasets` folder using the `os.walk` function.

This code allows for downloading and extracting files from the internet, while also checking if the file already exists in the local directory. It provides a convenient way to manage datasets and avoid unnecessary downloads.
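
Note that the explanation describes a regenerated function, `download_and_extract_file`, rather than the `download_file` function from the first reply. This is consistent with the second request containing two `user` messages and no `assistant` turn, so the model never saw its earlier answer. A minimal sketch of how the earlier reply could be carried forward follows; the helper name `build_followup_messages` and its parameter names are assumptions made for illustration.

```python
def build_followup_messages(first_prompt, first_reply, followup_prompt):
    """Build a chat history that includes the model's earlier answer.

    The assistant turn sits between the two user turns, so the follow-up
    question refers to text the model can actually see.
    """
    return [
        {"role": "user", "content": first_prompt},
        {"role": "assistant", "content": first_reply},
        {"role": "user", "content": followup_prompt},
    ]


# Hypothetical usage in this notebook. The first reply has to be kept in its
# own variable, because `results` is overwritten by the second call:
#   first_reply = get_completion_from_messages([{"role": "user", "content": instructions}])
#   messages = build_followup_messages(instructions, first_reply, instructions2)
#   results = get_completion_from_messages(messages=messages)
```

With the assistant turn included, the model is asked to explain code that is actually present in the conversation instead of having to write it again.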

In [12]:
print(results)

Sure! Here is the explanation of the code:

```python
import os
import urllib.request
import zipfile

def download_and_extract_file(url, filename):
    # Check if the file already exists in the directory
    if os.path.exists('datasets/' + filename):
        # Print the directory tree relative to the datasets folder
        for root, dirs, files in os.walk('datasets'):
            level = root.replace('datasets', '').count(os.sep)
            indent = ' ' * 4 * (level)
            print('{}{}/'.format(indent, os.path.basename(root)))
            subindent = ' ' * 4 * (level + 1)
            for f in files:
                print('{}{}'.format(subindent, f))
        return
    
    # Download the file from the internet
    urllib.request.urlretrieve(url, 'datasets/' + filename)
    
    # Extract the zip file
    with zipfile.ZipFile('datasets/' + filename, 'r') as zip_ref:
        zip_ref.extractall('datasets')
    
    # Print the directory tree relative to the datasets folder
    for 