# 📝 init_00_ECF_Setup

Download the ECF (Extreme Climate Forecasting) dataset archive (`data_clima.csv.zip`) and unzip it into Google Drive.

## 🏗️ Environment & Directory Setup

In [1]:
from google.colab import drive
import os

# Mount Google Drive
drive.mount('/content/drive')

# Set project directory
project_dir = "/content/drive/MyDrive/extreme-climate-forecasting"
os.makedirs(project_dir, exist_ok=True)
os.chdir(project_dir)

print("Working directory:", os.getcwd())

Mounted at /content/drive
Working directory: /content/drive/MyDrive/extreme-climate-forecasting


In [2]:
# Shell alternative. Note that `!cd` runs in a subshell and does not persist; `%cd` does.
# !mkdir -p /content/drive/MyDrive/extreme-climate-forecasting
# %cd /content/drive/MyDrive/extreme-climate-forecasting

## ⬇️ Install gdown (if not already installed)

In [3]:
try:
    import gdown
    print("gdown is already installed")
except ImportError:
    print("gdown not found. Installing...")
    import subprocess
    import sys
    subprocess.check_call([sys.executable, "-m", "pip", "install", "gdown"])
    import gdown
    print("✓ gdown installed successfully")

gdown is already installed


In [4]:
# !pip install gdown

## 🔽 Download `data_clima.csv.zip` from Google Drive

In [5]:
zip_filename = "data_clima.csv.zip"
zip_path = f"{project_dir}/{zip_filename}"
file_id = "1xPhrvg30Ull9KqtU_3TK6ZJlBSTuaH8A"

print("Downloading data_clima.csv.zip...")
gdown.download(id=file_id, output=zip_path, quiet=False)
print("✓ Download complete")

Downloading data_clima.csv.zip...


Downloading...
From (original): https://drive.google.com/uc?id=1xPhrvg30Ull9KqtU_3TK6ZJlBSTuaH8A
From (redirected): https://drive.google.com/uc?id=1xPhrvg30Ull9KqtU_3TK6ZJlBSTuaH8A&confirm=t&uuid=9505ebbc-2f1c-47ae-9de1-21e50fd1ed81
To: /content/drive/MyDrive/extreme-climate-forecasting/data_clima.csv.zip
100%|██████████| 177M/177M [00:01<00:00, 120MB/s]


✓ Download complete


In [6]:
# !gdown --id 1xPhrvg30Ull9KqtU_3TK6ZJlBSTuaH8A -O /content/drive/MyDrive/extreme-climate-forecasting/data_clima.csv.zip
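
A download can end up truncated or otherwise unreadable, and the extraction step would then fail partway through. The sketch below checks the archive before unzipping. The helper name `is_valid_zip` is hypothetical and not part of the original notebook; it relies only on the standard-library `zipfile` module.

```python
import zipfile


def is_valid_zip(path):
    """Return True if `path` is a readable ZIP whose members pass their CRC check.

    Hypothetical helper, not part of the original notebook.
    """
    # is_zipfile() returns False for missing files and for non-ZIP content.
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as zf:
        # testzip() reads every member and returns the first corrupt name, or None.
        return zf.testzip() is None
```

In the notebook this could be called as `is_valid_zip(zip_path)` right after the download cell. Note that `testzip()` reads the whole archive, so it adds some time for large files.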

## 📤 Unzip the dataset

In [7]:
import zipfile
import time
from tqdm import tqdm
import os

# Create the target extraction folder
extract_path = os.path.join(project_dir, "data")
os.makedirs(extract_path, exist_ok=True)

zip_file = zip_path

print(f"Extracting {zip_filename} into {extract_path}...")

start_time = time.time()

with zipfile.ZipFile(zip_file, 'r') as zip_ref:
    file_list = zip_ref.namelist()

    for file in tqdm(file_list, desc="Extracting", unit="file"):
        zip_ref.extract(member=file, path=extract_path)

elapsed = time.time() - start_time

print(f"\n✓ Extraction complete in {elapsed:.2f} seconds")

Extracting data_clima.csv.zip into /content/drive/MyDrive/extreme-climate-forecasting/data...


Extracting: 100%|██████████| 2/2 [00:13<00:00,  6.64s/file]


✓ Extraction complete in 13.30 seconds


In [8]:
# For very large ZIPs, Python's zipfile can be slower than the system unzip command.

# Create the target extraction directory (if it doesn't exist)
# !mkdir -p /content/drive/MyDrive/extreme-climate-forecasting/data
# !unzip /content/drive/MyDrive/extreme-climate-forecasting/data_clima.csv.zip -d /content/drive/MyDrive/extreme-climate-forecasting/data
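
The archive holds two entries, and the verification listing later in this notebook shows that one of them is a macOS metadata file under `__MACOSX/`. A minimal sketch of an extraction that skips such entries follows. The helper names `is_macos_metadata` and `extract_clean` are hypothetical, and the filter rules (a `__MACOSX` top-level folder, `._` prefixes, `.DS_Store`) are an assumption about what counts as metadata.

```python
import os
import zipfile


def is_macos_metadata(name):
    """Heuristic: True for entries that macOS adds when creating ZIP archives."""
    parts = name.split("/")  # ZIP member names always use forward slashes
    basename = parts[-1]
    return (
        parts[0] == "__MACOSX"
        or basename.startswith("._")
        or basename == ".DS_Store"
    )


def extract_clean(zip_file, dest):
    """Extract every member except macOS metadata; return the extracted names."""
    os.makedirs(dest, exist_ok=True)
    with zipfile.ZipFile(zip_file) as zf:
        members = [n for n in zf.namelist() if not is_macos_metadata(n)]
        zf.extractall(path=dest, members=members)
    return members
```

With the notebook's variables, `extract_clean(zip_path, extract_path)` would stand in for the extraction loop above, at the cost of losing the per-file progress bar.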

## 🧹 Storage Cleanup
Remove the downloaded `.zip` archive now that its contents have been extracted.

In [9]:
# Delete the file directly
import os

# Define the path to the zip file
project_dir = "/content/drive/MyDrive/extreme-climate-forecasting"
zip_path = os.path.join(project_dir, zip_filename)

# Check if file exists before deleting to avoid errors
if os.path.exists(zip_path):
    os.remove(zip_path)
    print(f"✓ {zip_path} has been deleted.")
else:
    print("File not found. It may have already been deleted.")

✓ /content/drive/MyDrive/extreme-climate-forecasting/data_clima.csv.zip has been deleted.


In [10]:
# !rm /content/drive/MyDrive/extreme-climate-forecasting/data_clima.csv.zip
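
The cleanup cell above deletes the archive unconditionally. A more cautious variant, sketched here, deletes it only once the extracted file is present and non-empty, so a failed extraction does not force a second download. The helper name `remove_zip_if_extracted` is hypothetical, and "non-empty" is only a rough proxy for a successful extraction.

```python
import os


def remove_zip_if_extracted(zip_path, extracted_file):
    """Delete `zip_path` only if `extracted_file` exists and is non-empty.

    Returns True when the extracted file looks usable (the archive is then
    removed if it was still there), and False when the archive was kept.
    """
    extracted_ok = (
        os.path.isfile(extracted_file) and os.path.getsize(extracted_file) > 0
    )
    if not extracted_ok:
        return False
    if os.path.exists(zip_path):
        os.remove(zip_path)
    return True
```

In this notebook the call might look like `remove_zip_if_extracted(zip_path, os.path.join(project_dir, "data", "data_clima.csv"))`.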

## 🔍 Verification

In [11]:
import os

def print_tree(root_dir):
    print(f"Final project structure under: {root_dir}")
    for current_path, dirs, files in os.walk(root_dir):
        indent_level = current_path.replace(root_dir, "").count(os.sep)
        indent = "    " * indent_level
        print(f"{indent}{os.path.basename(current_path)}/")
        for f in files:
            print(f"{indent}    {f}")

print_tree("/content/drive/MyDrive/extreme-climate-forecasting/data")

Final project structure under: /content/drive/MyDrive/extreme-climate-forecasting/data
data/
    data_clima.csv
    __MACOSX/
        ._data_clima.csv


In [12]:
# !ls -R /content/drive/MyDrive/extreme-climate-forecasting/data
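
The tree listing confirms that the files exist but not that they have plausible sizes. As a rough complement, the sketch below collects file sizes under a directory and formats them for reading. The helper names `human_size` and `list_sizes` are hypothetical and not part of the original notebook.

```python
import os


def human_size(num_bytes):
    """Format a byte count using binary units (B, KiB, MiB, GiB)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        # Stop at the first unit that fits, or at the largest unit available.
        if size < 1024 or unit == "GiB":
            return f"{size:.1f} {unit}"
        size /= 1024


def list_sizes(root_dir):
    """Map each file path (relative to root_dir) to its size in bytes."""
    sizes = {}
    for current_path, _dirs, files in os.walk(root_dir):
        for name in files:
            full_path = os.path.join(current_path, name)
            sizes[os.path.relpath(full_path, root_dir)] = os.path.getsize(full_path)
    return sizes
```

A possible use in this notebook is `for path, size in list_sizes(extract_path).items(): print(path, human_size(size))`.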

## Directory Tree

In [13]:
!apt install tree

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  tree
0 upgraded, 1 newly installed, 0 to remove and 2 not upgraded.
Need to get 47.9 kB of archives.
After this operation, 116 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy/universe amd64 tree amd64 2.0.2-1 [47.9 kB]
Fetched 47.9 kB in 1s (84.8 kB/s)
Selecting previously unselected package tree.
(Reading database ... 117540 files and directories currently installed.)
Preparing to unpack .../tree_2.0.2-1_amd64.deb ...
Unpacking tree (2.0.2-1) ...
Setting up tree (2.0.2-1) ...
Processing triggers for man-db (2.10.2-1) ...


In [14]:
!tree /content/drive/MyDrive/extreme-climate-forecasting/

/content/drive/MyDrive/extreme-climate-forecasting/
├── data
│   ├── data_clima.csv
│   └── __MACOSX
└── init_00_ECF_Setup.ipynb

2 directories, 2 files
