# DEMO: S3 connection (read)

[EN] This notebook shows how to read files from an S3 bucket.

[ES] En este notebook se muestra cómo leer ficheros en un bucket de S3.

### 1. Package Installation  

In [None]:
!pip install boto3
!pip install pandas
# StringIO is part of Python's standard library (io module), so it needs no installation

### 2. Package import

[EN] The following packages are required:

- **boto3**: access to AWS functionality.
- **pandas**: Python library specialized in data manipulation and analysis.
- **StringIO**: an in-memory text stream from the standard-library `io` module.

[ES] A continuación, se importan los paquetes necesarios en esta demo:

- **boto3**: acceso a las funcionalidades de AWS.
- **pandas**: librería de Python especializada en la manipulación y el análisis de datos.
- **StringIO**: stream de texto en memoria del módulo estándar `io`.

In [None]:
import boto3
import pandas as pd
from io import StringIO
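
[EN] To illustrate what `StringIO` does in this demo, the following minimal sketch wraps a small CSV string in an in-memory text buffer and reads it with the standard-library `csv` module. The sample data is invented for illustration; `pd.read_csv` accepts the same kind of file-like object.

```python
import csv
from io import StringIO

# Hypothetical sample data; in this notebook the text comes from the S3 object.
csv_text = "id,name\n1,Ana\n2,Luis\n"

# StringIO exposes the string through a file-like interface,
# so any reader that expects an open text file can consume it.
buffer = StringIO(csv_text)

rows = list(csv.reader(buffer))
print(rows)
```

Nothing is written to disk at any point: the buffer lives only in memory.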

### 3. Initial configuration

[EN] Define the bucket and the file location.

[ES] Debe definir el bucket y la localización del fichero.

In [None]:
BUCKET_NAME = 'data-clinic-hackathon-2024'
BUCKET_FILE_LOCATION_AND_NAME = 'BASE_DEMO/data_demo.csv'
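
[EN] If the location is available as a single `s3://` URI rather than as separate values, the two parts can be derived from it. This is a minimal sketch: `split_s3_uri` is a hypothetical helper, not part of boto3, and it assumes the key contains no `?` or `#` characters.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key). Hypothetical helper."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not a valid S3 URI: {uri!r}")
    # The bucket is the network location; the key is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_uri("s3://data-clinic-hackathon-2024/BASE_DEMO/data_demo.csv")
print(bucket)
print(key)
```

The resulting values correspond to `BUCKET_NAME` and `BUCKET_FILE_LOCATION_AND_NAME`.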

### 4. Reading functions

#### 4.1 CSV file reading

[EN] Function that reads a CSV file from an S3 bucket and loads its content into a pandas DataFrame.

[ES] Función que permite leer un fichero csv de un bucket en S3 y guardar su contenido en un pandas dataframe.

In [None]:
def read_csv_from_bucket(bucket_name, file_name):
    """Read a CSV file from an S3 bucket into a pandas DataFrame (None on failure)."""
    try:
        s3_client = boto3.client('s3')

        # Fetch the S3 object
        response = s3_client.get_object(Bucket=bucket_name, Key=file_name)

        # Read the file content as bytes
        content = response['Body'].read()

        # Decode the bytes into a string (assumes the file is UTF-8 encoded)
        csv_content = content.decode('utf-8')

        # Wrap the string in an in-memory buffer and load it into a pandas DataFrame
        df = pd.read_csv(StringIO(csv_content))

        return df

    except Exception as e:
        print(f"Error reading the file from the bucket: {e}")
        return None
        

## EXAMPLE

In [None]:
df = read_csv_from_bucket(BUCKET_NAME, BUCKET_FILE_LOCATION_AND_NAME)

if df is not None:
    print(df)
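
[EN] The decoding step assumes plain UTF-8. Files exported from some spreadsheet tools start with a byte-order mark (BOM), which may end up glued to the first column name depending on the parser. The sketch below uses invented sample bytes and the standard-library `csv` module to show the difference between the `utf-8` and `utf-8-sig` codecs.

```python
import csv
from io import StringIO

# Hypothetical bytes, as they might come back from response['Body'].read()
raw = b"\xef\xbb\xbfid,name\n1,Ana\n"

plain = raw.decode("utf-8")      # keeps the BOM as the character '\ufeff'
clean = raw.decode("utf-8-sig")  # strips a leading BOM if one is present

header_plain = next(csv.reader(StringIO(plain)))
header_clean = next(csv.reader(StringIO(clean)))

print(header_plain)
print(header_clean)
```

Since `utf-8-sig` also decodes files without a BOM, it can be a reasonable default when the origin of the file is unknown.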

#### 4.2 TXT file reading

[EN] Function that reads a TXT file from an S3 bucket and returns its content.

[ES] Función que permite leer un fichero txt de un bucket en S3 y guardar su contenido en una variable.

In [None]:
def read_txt_from_bucket(bucket_name, file_name):
    """Read a text file from an S3 bucket and return it as a string (None on failure)."""
    try:
        s3_client = boto3.client('s3')

        # Fetch the S3 object
        response = s3_client.get_object(Bucket=bucket_name, Key=file_name)

        # Read the file content as bytes
        content = response['Body'].read()

        # Decode the bytes into text (assumes the file is UTF-8 encoded)
        return content.decode('utf-8')

    except Exception as e:
        print(f"Error reading the file from the bucket: {e}")
        return None

## EXAMPLE

In [None]:
content = read_txt_from_bucket(BUCKET_NAME, BUCKET_FILE_LOCATION_AND_NAME)

print(content)
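
[EN] The reading logic can also be exercised without AWS credentials by passing the client in as a parameter. The following is a minimal sketch under that assumption: `read_text_object` and `FakeS3Client` are hypothetical names, and the fake client mimics only the `get_object` call used in this notebook.

```python
from io import BytesIO


def read_text_object(s3_client, bucket_name, file_name, encoding="utf-8"):
    """Fetch an object through the given client and return its content as text."""
    response = s3_client.get_object(Bucket=bucket_name, Key=file_name)
    return response["Body"].read().decode(encoding)


class FakeS3Client:
    """Hypothetical stand-in that mimics only get_object, for offline runs."""

    def __init__(self, objects):
        self._objects = objects  # maps (bucket, key) -> bytes

    def get_object(self, Bucket, Key):
        # Return the same shape the notebook relies on: a dict with a readable 'Body'.
        return {"Body": BytesIO(self._objects[(Bucket, Key)])}


fake = FakeS3Client({("demo-bucket", "notes.txt"): "hola\nmundo\n".encode("utf-8")})
text = read_text_object(fake, "demo-bucket", "notes.txt")
print(text)
```

Against a real bucket, the same function would be called with `boto3.client('s3')` as its first argument.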