# Universidad Internacional de La Rioja  

### Máster Universitario en Visual Analytics and Big Data  

---

### **Prediction and Analysis of Product Demand and Supply between the Andean Community and Spain**  

### **Final Degree Project**  
- **Presented by:** Danilo Andrés Beleño Villafañe  

---

## **Code 1: Moving Data from the Transient Zone to the Raw Data Zone**  


In [1]:
import time
import zipfile
import logging
from io import BytesIO
from google.cloud import storage
from concurrent.futures import ThreadPoolExecutor

In [2]:
logging.basicConfig(level=logging.INFO)

In [3]:
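def process_blob(blob, destination_bucket, target_zip_folder):
    """Download one ZIP blob and upload each file it contains to the destination bucket."""
    try:
        logging.info(f'Processing {blob.name}')
        # The whole archive is held in memory, so large ZIP files need enough RAM.
        zip_data = blob.download_as_bytes()

        with BytesIO(zip_data) as zip_file, zipfile.ZipFile(zip_file, 'r') as z:
            for filename in z.namelist():
                # Skip directory entries, which would otherwise create empty objects.
                if filename.endswith('/'):
                    continue
                logging.info(f'Extracting {filename}')

                file_data = z.read(filename)
                destination_blob_name = f'{target_zip_folder}{filename}'
                destination_blob = destination_bucket.blob(destination_blob_name)

                destination_blob.upload_from_string(file_data)
                logging.info(f'{filename} extracted to {destination_blob_name}')
    except Exception as e:
        # Errors are logged rather than raised so one bad archive does not stop the rest.
        logging.error(f"Error processing {blob.name}: {e}")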
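The member loop in `process_blob` can be tried out without a bucket. The following is a minimal sketch, assuming only the standard library: `iter_zip_members` and `build_sample_zip` are hypothetical helpers, and the sample file names are invented for illustration. It skips directory entries and flags members that are themselves ZIP files, a case that shows up in the log of this notebook (`comex_taric_201912.zip` is uploaded as a member without being unpacked).

```python
import zipfile
from io import BytesIO


def iter_zip_members(zip_bytes):
    """Yield (name, data) for every regular file stored in an in-memory ZIP."""
    with zipfile.ZipFile(BytesIO(zip_bytes), 'r') as archive:
        for info in archive.infolist():
            # Directory entries carry no data, so they are skipped.
            if info.is_dir():
                continue
            yield info.filename, archive.read(info)


def build_sample_zip():
    """Create a small ZIP in memory (hypothetical content, for illustration only)."""
    buffer = BytesIO()
    with zipfile.ZipFile(buffer, 'w') as archive:
        archive.writestr('folder/', b'')
        archive.writestr('sample_199501.csv', b'a;b\n1;2\n')
        archive.writestr('nested.zip', b'not unpacked here')
    return buffer.getvalue()


members = dict(iter_zip_members(build_sample_zip()))
# Members that are archives themselves may need a second extraction pass.
nested = [name for name in members if name.lower().endswith('.zip')]
print(sorted(members), nested)
```

Whether nested archives should be unpacked again or simply reported is a design decision the notebook does not state, so the sketch only lists them.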

In [4]:
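def extract_zip_to_bucket(source_bucket_name, destination_bucket_name, zip_folder, target_zip_folder):
    """Extract every ZIP file under zip_folder in the source bucket into the destination bucket."""
    client = storage.Client()
    source_bucket = client.bucket(source_bucket_name)
    destination_bucket = client.bucket(destination_bucket_name)

    # List only the objects whose names start with the given folder prefix.
    blobs = source_bucket.list_blobs(prefix=zip_folder)

    # Process up to three archives concurrently; leaving the with-block waits for all tasks.
    with ThreadPoolExecutor(max_workers=3) as executor:
        for blob in blobs:
            if blob.name.endswith('.zip'):
                executor.submit(process_blob, blob, destination_bucket, target_zip_folder)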
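`extract_zip_to_bucket` discards the futures returned by `executor.submit`, so the only record of a failed archive is the log line written inside `process_blob`. One possible variant, sketched here under assumptions, keeps the futures and collects the items that failed. `run_in_pool` and `fake_task` are hypothetical names; `fake_task` stands in for `process_blob`, which would have to re-raise its exception (or return a status) for this pattern to see failures.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_in_pool(task, items, max_workers=3):
    """Run task(item) for each item in a thread pool and split successes from failures."""
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the item it was created for.
        futures = {executor.submit(task, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                future.result()  # re-raises any exception from the worker thread
            except Exception as exc:
                logging.error(f'{item} failed: {exc}')
                failed.append(item)
            else:
                succeeded.append(item)
    # Completion order is not deterministic, so sort for a stable result.
    return sorted(succeeded), sorted(failed)


def fake_task(name):
    """Hypothetical stand-in for process_blob that fails on one input."""
    if name.endswith('bad.zip'):
        raise ValueError('corrupt archive')
    return name


ok, errors = run_in_pool(fake_task, ['a.zip', 'b.zip', 'bad.zip'])
print(ok, errors)
```

The returned list of failures could then be retried or reported at the end of the run instead of being searched for in the log.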

In [5]:
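# Source bucket (transient zone) and destination bucket (raw data zone).
transient_zone = 'data-factory-0-transient-zone'
raw_data_zone = 'data-factory-1-raw-data-zone'
# Folder prefixes; the same path is used in both buckets.
source_data_folder = 'data/datacomex/taric/'
target_data_folder = 'data/datacomex/taric/'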

In [6]:
start_time = time.time()

extract_zip_to_bucket(transient_zone, raw_data_zone, source_data_folder, target_data_folder)

end_time = time.time()

INFO:root:Processing data/datacomex/taric/comex_taric_199501.zip
INFO:root:Processing data/datacomex/taric/comex_taric_199502.zip
INFO:root:Processing data/datacomex/taric/comex_taric_199503.zip
INFO:root:Extracting comex_taric_199501.csv
INFO:root:Extracting comex_taric_199502.csv
INFO:root:Extracting comex_taric_199503.csv
INFO:root:comex_taric_199501.csv extracted to data/datacomex/taric/comex_taric_199501.csv
INFO:root:Processing data/datacomex/taric/comex_taric_199504.zip
INFO:root:Extracting comex_taric_199504.csv
INFO:root:comex_taric_199502.csv extracted to data/datacomex/taric/comex_taric_199502.csv
INFO:root:Processing data/datacomex/taric/comex_taric_199505.zip
INFO:root:comex_taric_199503.csv extracted to data/datacomex/taric/comex_taric_199503.csv
INFO:root:Processing data/datacomex/taric/comex_taric_199506.zip
INFO:root:Extracting comex_taric_199505.csv
INFO:root:Extracting comex_taric_199506.csv
INFO:root:comex_taric_199504.csv extracted to data/datacomex/taric/comex_tar

INFO:root:Extracting comex_taric_199807.csv
INFO:root:comex_taric_199805.csv extracted to data/datacomex/taric/comex_taric_199805.csv
INFO:root:Processing data/datacomex/taric/comex_taric_199808.zip
INFO:root:Extracting comex_taric_199808.csv
INFO:root:comex_taric_199806.csv extracted to data/datacomex/taric/comex_taric_199806.csv
INFO:root:Processing data/datacomex/taric/comex_taric_199809.zip
INFO:root:Extracting comex_taric_199809.csv
INFO:root:comex_taric_199808.csv extracted to data/datacomex/taric/comex_taric_199808.csv
INFO:root:Processing data/datacomex/taric/comex_taric_199810.zip
INFO:root:Extracting comex_taric_199810.csv
INFO:root:comex_taric_199807.csv extracted to data/datacomex/taric/comex_taric_199807.csv
INFO:root:Processing data/datacomex/taric/comex_taric_199811.zip
INFO:root:Extracting comex_taric_199811.csv
INFO:root:comex_taric_199809.csv extracted to data/datacomex/taric/comex_taric_199809.csv
INFO:root:Processing data/datacomex/taric/comex_taric_199812.zip
INFO:

INFO:root:comex_taric_200110.csv extracted to data/datacomex/taric/comex_taric_200110.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200201.zip
INFO:root:Extracting comex_taric_200201.csv
INFO:root:comex_taric_200111.csv extracted to data/datacomex/taric/comex_taric_200111.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200202.zip
INFO:root:Extracting comex_taric_200202.csv
INFO:root:comex_taric_200112.csv extracted to data/datacomex/taric/comex_taric_200112.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200203.zip
INFO:root:Extracting comex_taric_200203.csv
INFO:root:comex_taric_200201.csv extracted to data/datacomex/taric/comex_taric_200201.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200204.zip
INFO:root:Extracting comex_taric_200204.csv
INFO:root:comex_taric_200202.csv extracted to data/datacomex/taric/comex_taric_200202.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200205.zip
INFO:root:Extracting comex_taric_200205.csv
INFO:

INFO:root:Processing data/datacomex/taric/comex_taric_200506.zip
INFO:root:Extracting comex_taric_200506.csv
INFO:root:comex_taric_200504.csv extracted to data/datacomex/taric/comex_taric_200504.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200507.zip
INFO:root:Extracting comex_taric_200507.csv
INFO:root:comex_taric_200505.csv extracted to data/datacomex/taric/comex_taric_200505.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200508.zip
INFO:root:Extracting comex_taric_200508.csv
INFO:root:comex_taric_200506.csv extracted to data/datacomex/taric/comex_taric_200506.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200509.zip
INFO:root:Extracting comex_taric_200509.csv
INFO:root:comex_taric_200507.csv extracted to data/datacomex/taric/comex_taric_200507.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200510.zip
INFO:root:Extracting comex_taric_200510.csv
INFO:root:comex_taric_200508.csv extracted to data/datacomex/taric/comex_taric_200508.csv
INFO:

INFO:root:Extracting comex_taric_200811.csv
INFO:root:comex_taric_200809.csv extracted to data/datacomex/taric/comex_taric_200809.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200812.zip
INFO:root:Extracting comex_taric_200812.csv
INFO:root:comex_taric_200810.csv extracted to data/datacomex/taric/comex_taric_200810.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200901.zip
INFO:root:Extracting comex_taric_200901.csv
INFO:root:comex_taric_200811.csv extracted to data/datacomex/taric/comex_taric_200811.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200902.zip
INFO:root:Extracting comex_taric_200902.csv
INFO:root:comex_taric_200812.csv extracted to data/datacomex/taric/comex_taric_200812.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200903.zip
INFO:root:Extracting comex_taric_200903.csv
INFO:root:comex_taric_200901.csv extracted to data/datacomex/taric/comex_taric_200901.csv
INFO:root:Processing data/datacomex/taric/comex_taric_200904.zip
INFO:

INFO:root:comex_taric_201202.csv extracted to data/datacomex/taric/comex_taric_201202.csv
INFO:root:Processing data/datacomex/taric/comex_taric_201205.zip
INFO:root:Extracting comex_taric_201205.csv
INFO:root:comex_taric_201203.csv extracted to data/datacomex/taric/comex_taric_201203.csv
INFO:root:Processing data/datacomex/taric/comex_taric_201206.zip
INFO:root:Extracting comex_taric_201206.csv
INFO:root:comex_taric_201204.csv extracted to data/datacomex/taric/comex_taric_201204.csv
INFO:root:Processing data/datacomex/taric/comex_taric_201207.zip
INFO:root:Extracting comex_taric_201207.csv
INFO:root:comex_taric_201205.csv extracted to data/datacomex/taric/comex_taric_201205.csv
INFO:root:Processing data/datacomex/taric/comex_taric_201208.zip
INFO:root:Extracting comex_taric_201208.csv
INFO:root:comex_taric_201206.csv extracted to data/datacomex/taric/comex_taric_201206.csv
INFO:root:Processing data/datacomex/taric/comex_taric_201209.zip
INFO:root:Extracting comex_taric_201209.csv
INFO:

INFO:root:Processing data/datacomex/taric/comex_taric_201510.zip
INFO:root:comex_taric_201507.csv extracted to data/datacomex/taric/comex_taric_201507.csv
INFO:root:Processing data/datacomex/taric/comex_taric_201511.zip
INFO:root:Extracting comex_taric_201510.csv
INFO:root:Extracting comex_taric_201511.csv
INFO:root:comex_taric_201509.csv extracted to data/datacomex/taric/comex_taric_201509.csv
INFO:root:Processing data/datacomex/taric/comex_taric_201512.zip
INFO:root:Extracting comex_taric_201512.csv
INFO:root:comex_taric_201510.csv extracted to data/datacomex/taric/comex_taric_201510.csv
INFO:root:Processing data/datacomex/taric/comex_taric_201601.zip
INFO:root:Extracting comex_taric_201601.csv
INFO:root:comex_taric_201511.csv extracted to data/datacomex/taric/comex_taric_201511.csv
INFO:root:Processing data/datacomex/taric/comex_taric_201602.zip
INFO:root:Extracting comex_taric_201602.csv
INFO:root:comex_taric_201512.csv extracted to data/datacomex/taric/comex_taric_201512.csv
INFO:

INFO:root:Extracting comex_taric_201903.csv
INFO:root:comex_taric_201901.csv extracted to data/datacomex/taric/comex_taric_201901.csv
INFO:root:Processing data/datacomex/taric/comex_taric_201904.zip
INFO:root:Extracting comex_taric_201904.csv
INFO:root:comex_taric_201902.csv extracted to data/datacomex/taric/comex_taric_201902.csv
INFO:root:Processing data/datacomex/taric/comex_taric_201905.zip
INFO:root:Extracting comex_taric_201905.csv
INFO:root:comex_taric_201903.csv extracted to data/datacomex/taric/comex_taric_201903.csv
INFO:root:Extracting comex_taric_201912.zip
INFO:root:comex_taric_201912.zip extracted to data/datacomex/taric/comex_taric_201912.zip
INFO:root:Processing data/datacomex/taric/comex_taric_201906.zip
INFO:root:Extracting comex_taric_201906.csv
INFO:root:comex_taric_201904.csv extracted to data/datacomex/taric/comex_taric_201904.csv
INFO:root:Processing data/datacomex/taric/comex_taric_201907.zip
INFO:root:Extracting comex_taric_201907.csv
INFO:root:comex_taric_2019

INFO:root:Processing data/datacomex/taric/comex_taric_202208.zip
INFO:root:Extracting comex_taric_202208.csv
INFO:root:comex_taric_202206.csv extracted to data/datacomex/taric/comex_taric_202206.csv
INFO:root:Processing data/datacomex/taric/comex_taric_202209.zip
INFO:root:Extracting comex_taric_202209.csv
INFO:root:comex_taric_202207.csv extracted to data/datacomex/taric/comex_taric_202207.csv
INFO:root:Processing data/datacomex/taric/comex_taric_202210.zip
INFO:root:Extracting comex_taric_202210.csv
INFO:root:comex_taric_202208.csv extracted to data/datacomex/taric/comex_taric_202208.csv
INFO:root:Processing data/datacomex/taric/comex_taric_202211.zip
INFO:root:Extracting comex_taric_202211.csv
INFO:root:comex_taric_202209.csv extracted to data/datacomex/taric/comex_taric_202209.csv
INFO:root:Processing data/datacomex/taric/comex_taric_202212.zip
INFO:root:Extracting comex_taric_202212.csv
INFO:root:comex_taric_202210.csv extracted to data/datacomex/taric/comex_taric_202210.csv
INFO:
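Because the log above is truncated and interleaved across threads, it is hard to confirm by eye that every monthly file arrived. A minimal sketch of a completeness check follows, assuming the `comex_taric_YYYYMM.csv` naming seen in the log. `expected_csv_names` and `missing_periods` are hypothetical helpers, and the first and last periods are parameters the caller must supply, since the notebook does not state the full range.

```python
def expected_csv_names(first_period, last_period, prefix='comex_taric_'):
    """Return one CSV name per month from first_period to last_period (YYYYMM, inclusive)."""
    year, month = divmod(first_period, 100)
    last_year, last_month = divmod(last_period, 100)
    names = []
    while (year, month) <= (last_year, last_month):
        names.append(f'{prefix}{year}{month:02d}.csv')
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names


def missing_periods(found_names, first_period, last_period):
    """Return the expected CSV names that are absent from found_names."""
    found = set(found_names)
    return [name for name in expected_csv_names(first_period, last_period)
            if name not in found]


# Example with an invented listing in which December 1995 is absent.
found = ['comex_taric_199511.csv', 'comex_taric_199601.csv']
print(missing_periods(found, 199511, 199601))
```

In practice `found_names` would come from listing the destination bucket with the same `list_blobs(prefix=...)` call used for the source, with the folder prefix stripped from each name.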

In [7]:
elapsed_time = end_time - start_time

hours, remainder = divmod(elapsed_time, 3600)
minutes, seconds = divmod(remainder, 60)

print(f"The execution time was: {int(hours)} hours, {int(minutes)} minutes, and {int(seconds)} seconds.")

The execution time was: 0 hours, 8 minutes, and 9 seconds.
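The conversion above can be checked by hand: 8 minutes and 9 seconds correspond to 8 × 60 + 9 = 489 seconds. The sketch below wraps the same two `divmod` calls in a hypothetical `format_elapsed` helper so the arithmetic can be reused and tested; it is an illustration, not part of the original notebook.

```python
def format_elapsed(elapsed_seconds):
    """Format a duration in seconds as hours, minutes, and seconds."""
    # First split off whole hours, then split the remainder into minutes and seconds.
    hours, remainder = divmod(elapsed_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{int(hours)} hours, {int(minutes)} minutes, and {int(seconds)} seconds"


print(format_elapsed(489))  # 0 hours, 8 minutes, and 9 seconds
```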
