# Silver Layer Transformations (Raw Data to Clean Data)

## Objective
The goal here is to clean and standardize the data before moving it to the Gold layer.

### Tasks
- Ingest the data
- Check the data types
- Convert columns to the correct type
- Create date columns
- Save to the silver layer



## Importing the libraries

## Spark session

In [2]:
from pyspark.sql.functions import *
from pyspark.sql.types import *
from pyspark.sql import SparkSession
from pyspark.sql import Window



In [3]:
spark = SparkSession.builder.appName("TransformationsSilverLayer").getOrCreate()
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")
spark.conf.set("spark.microsoft.delta.optimizeWrite.enabled", "true")
spark.conf.set("spark.microsoft.delta.optimizeWrite.binSize", "1073741824")


## Importing the data from the Bronze layer
There are a few options here:
- Import from the Data Lake storage (landing zone);
- Import from a shortcut;
- Import from the table published in the Lakehouse under the `bronze` schema.

The examples below use the third option for simplicity.


In [8]:
# Use variables to parameterize the code:

INPUT_LAYER = "bronze"
WORKSPACE_NAME = spark.conf.get("trident.workspace.name")
LAKEHOUSE_NAME = spark.conf.get("trident.lakehouse.name")
OUTPUT_LAYER = "silver"

# Full output directory
OUTPUT_PATH = f"abfss://{WORKSPACE_NAME}@onelake.dfs.fabric.microsoft.com/{LAKEHOUSE_NAME}.Lakehouse/Tables/{OUTPUT_LAYER}"
# Short output directory
# Tables/{OUTPUT_LAYER}/olist_customers_dataset.csv



In [9]:
customers_df = spark.sql(f"SELECT * FROM {LAKEHOUSE_NAME}.{INPUT_LAYER}.olist_customers_dataset")
geolocation_df = spark.sql(f"SELECT * FROM {LAKEHOUSE_NAME}.{INPUT_LAYER}.olist_geolocation_dataset")
order_items_df = spark.sql(f"SELECT * FROM {LAKEHOUSE_NAME}.{INPUT_LAYER}.olist_order_items_dataset")
order_payments_df = spark.sql(f"SELECT * FROM {LAKEHOUSE_NAME}.{INPUT_LAYER}.olist_order_payments_dataset")
order_reviews_df = spark.sql(f"SELECT * FROM {LAKEHOUSE_NAME}.{INPUT_LAYER}.olist_order_reviews_dataset")
orders_df = spark.sql(f"SELECT * FROM {LAKEHOUSE_NAME}.{INPUT_LAYER}.olist_orders_dataset")
products_df = spark.sql(f"SELECT * FROM {LAKEHOUSE_NAME}.{INPUT_LAYER}.olist_products_dataset")
sellers_df = spark.sql(f"SELECT * FROM {LAKEHOUSE_NAME}.{INPUT_LAYER}.olist_sellers_dataset")
product_category_name_translation_df = spark.sql(f"SELECT * FROM {LAKEHOUSE_NAME}.{INPUT_LAYER}.product_category_name_translation")


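The nine reads above follow one pattern, so they can also be written as a loop. The following is a minimal sketch, not the notebook's original code: `BRONZE_TABLES`, `qualified_name`, and `load_bronze_tables` are hypothetical names, and it assumes `spark.table` resolves the same three-part name that the `SELECT *` queries use.

```python
# Hypothetical helper: load every bronze table into a dict keyed by table name.
BRONZE_TABLES = [
    "olist_customers_dataset",
    "olist_geolocation_dataset",
    "olist_order_items_dataset",
    "olist_order_payments_dataset",
    "olist_order_reviews_dataset",
    "olist_orders_dataset",
    "olist_products_dataset",
    "olist_sellers_dataset",
    "product_category_name_translation",
]


def qualified_name(lakehouse, layer, table):
    """Build the lakehouse.schema.table name used in the queries above."""
    return f"{lakehouse}.{layer}.{table}"


def load_bronze_tables(spark, lakehouse, layer="bronze", tables=BRONZE_TABLES):
    """Return {table name: DataFrame}; spark.table(name) reads the whole table."""
    return {t: spark.table(qualified_name(lakehouse, layer, t)) for t in tables}
```

In the notebook this could be called as `bronze = load_bronze_tables(spark, LAKEHOUSE_NAME, INPUT_LAYER)`, after which `bronze["olist_orders_dataset"]` plays the role of `orders_df`.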

## Displaying the bronze layer tables
There are a few ways to display the tables that were read; they work for any DataFrame.

```python
dataframe.show()
display(dataframe)
```

## 1: Viewing the tables
Let's look at the data in each table to decide which changes to make.
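As a complement to reading each `printSchema()` output by eye, the column types can be grouped programmatically. This is a minimal sketch: `columns_by_type` is a hypothetical helper, and it relies only on `DataFrame.dtypes`, which returns `(column name, type name)` pairs. The sample list mirrors the `order_payments_df` schema printed in this section.

```python
from collections import defaultdict


def columns_by_type(dtypes):
    """Group column names by Spark type name.

    `dtypes` is the list of (column name, type name) pairs from DataFrame.dtypes.
    """
    grouped = defaultdict(list)
    for name, type_name in dtypes:
        grouped[type_name].append(name)
    return dict(grouped)


# Every order_payments column arrives from bronze as a string.
payments_dtypes = [
    ("order_id", "string"),
    ("payment_sequential", "string"),
    ("payment_type", "string"),
    ("payment_installments", "string"),
    ("payment_value", "string"),
]
print(columns_by_type(payments_dtypes))
```

With a live DataFrame the call would be `columns_by_type(order_payments_df.dtypes)`.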


In [10]:
# Show the first 2 rows and the schema of customers_df
customers_df.show(2,vertical=True,truncate=False)
customers_df.printSchema()


-RECORD 0----------------------------------------------------
 customer_id              | b8f26e9e2db5ca6ecf0223a40994e02b 
 customer_unique_id       | 2369cc934990a1abc5b18b5ccf8da1ba 
 customer_zip_code_prefix | 21555                            
 customer_city            | rio de janeiro                   
 customer_state           | RJ                               
-RECORD 1----------------------------------------------------
 customer_id              | 9e226cfa83b3ebd7c9aa0650697ecce2 
 customer_unique_id       | ac6c38dcc2677a2e9510f1d508723883 
 customer_zip_code_prefix | 21740                            
 customer_city            | rio de janeiro                   
 customer_state           | RJ                               
only showing top 2 rows

root
 |-- customer_id: string (nullable = true)
 |-- customer_unique_id: string (nullable = true)
 |-- customer_zip_code_prefix: string (nullable = true)
 |-- customer_city: string (nullable = true)
 |-- customer_state: string (nul

In [11]:
# Show the first 2 rows and the schema of geolocation_df
geolocation_df.show(2,vertical=True,truncate=False)
geolocation_df.printSchema()


-RECORD 0------------------------------------------
 geolocation_zip_code_prefix | 21330               
 geolocation_lat             | -22.88243323896361  
 geolocation_lng             | -43.358359086588976 
 geolocation_city            | rio de janeiro      
 geolocation_state           | RJ                  
-RECORD 1------------------------------------------
 geolocation_zip_code_prefix | 21330               
 geolocation_lat             | -22.88243323896361  
 geolocation_lng             | -43.358359086588976 
 geolocation_city            | rio de janeiro      
 geolocation_state           | RJ                  
only showing top 2 rows

root
 |-- geolocation_zip_code_prefix: string (nullable = true)
 |-- geolocation_lat: string (nullable = true)
 |-- geolocation_lng: string (nullable = true)
 |-- geolocation_city: string (nullable = true)
 |-- geolocation_state: string (nullable = true)



In [12]:
# Show the first 2 rows and the schema of order_items_df
order_items_df.show(2,vertical=True,truncate=False)
order_items_df.printSchema()



-RECORD 0-----------------------------------------------
 order_id            | 8272b63d03f5f79c56e9e4120aec44ef 
 order_item_id       | 8                                
 product_id          | 05b515fdc76e888aada3c6d66c201dff 
 seller_id           | 2709af9587499e95e803a6498a5a56e9 
 shipping_limit_date | 2017-07-21 18:25:23              
 price               | 1.20                             
 freight_value       | 7.89                             
-RECORD 1-----------------------------------------------
 order_id            | 8272b63d03f5f79c56e9e4120aec44ef 
 order_item_id       | 9                                
 product_id          | 05b515fdc76e888aada3c6d66c201dff 
 seller_id           | 2709af9587499e95e803a6498a5a56e9 
 shipping_limit_date | 2017-07-21 18:25:23              
 price               | 1.20                             
 freight_value       | 7.89                             
only showing top 2 rows

root
 |-- order_id: string (nullable = true)
 |-- order_item_id

In [13]:
# Show the first 2 rows and the schema of order_payments_df
order_payments_df.show(2,vertical=True,truncate=False)
order_payments_df.printSchema()


-RECORD 0------------------------------------------------
 order_id             | 931a32b4ca3fdc36741af29ea645d8bf 
 payment_sequential   | 2                                
 payment_type         | boleto                           
 payment_installments | 1                                
 payment_value        | 94.40                            
-RECORD 1------------------------------------------------
 order_id             | d19e65e88e48b013c91b0e77c852d06e 
 payment_sequential   | 2                                
 payment_type         | debit_card                       
 payment_installments | 1                                
 payment_value        | 99.50                            
only showing top 2 rows

root
 |-- order_id: string (nullable = true)
 |-- payment_sequential: string (nullable = true)
 |-- payment_type: string (nullable = true)
 |-- payment_installments: string (nullable = true)
 |-- payment_value: string (nullable = true)



In [14]:
# Show the first 2 rows and the schema of order_reviews_df
order_reviews_df.show(2,vertical=True,truncate=False)
order_reviews_df.printSchema()


-RECORD 0---------------------------------------------------
 review_id               | e06c059207dad93c6808dd69aad29217 
 order_id                | de6a26eab8b3a87c4d4ec2e2a8c2495e 
 review_score            | 2                                
 review_comment_title    | Farinheiro                       
 review_comment_message  | NULL                             
 review_creation_date    | 2018-06-16 00:00:00              
 review_answer_timestamp | 2018-06-18 11:08:40              
-RECORD 1---------------------------------------------------
 review_id               | 52baca75dbcbb53c69ae3e39e4632675 
 order_id                | e38ff07f7864e8fd4fd51687cba79d89 
 review_score            | 2                                
 review_comment_title    | Médio                            
 review_comment_message  | NULL                             
 review_creation_date    | 2018-07-25 00:00:00              
 review_answer_timestamp | 2018-07-28 03:09:47              
only showing top 2 rows


In [15]:
# Show the first 2 rows and the schema of orders_df
orders_df.show(2,vertical=True,truncate=False)
orders_df.printSchema()



-RECORD 0---------------------------------------------------------
 order_id                      | 7813842ae95e8c497fc0233232ae815a 
 customer_id                   | 040d94f8ba8ca26014bd6f7e8a6e0c0d 
 order_status                  | canceled                         
 order_purchase_timestamp      | 2018-08-17 20:06:36              
 order_approved_at             | NULL                             
 order_delivered_carrier_date  | NULL                             
 order_delivered_customer_date | NULL                             
 order_estimated_delivery_date | 2018-09-17 00:00:00              
-RECORD 1---------------------------------------------------------
 order_id                      | 5a14c8b3d919a4ef3f3428b0459c47b2 
 customer_id                   | 666094835d60d986eb87350b31efdcae 
 order_status                  | canceled                         
 order_purchase_timestamp      | 2017-05-29 23:53:39              
 order_approved_at             | NULL                         

In [16]:
# Show the first 2 rows and the schema of products_df
products_df.show(2,vertical=True,truncate=False)
products_df.printSchema()


-RECORD 0------------------------------------------------------
 product_id                 | a41e356c76fab66334f36de622ecbd3a 
 product_category_name      | NULL                             
 product_name_lenght        | NULL                             
 product_description_lenght | NULL                             
 product_photos_qty         | NULL                             
 product_weight_g           | 650                              
 product_length_cm          | 17                               
 product_height_cm          | 14                               
 product_width_cm           | 12                               
-RECORD 1------------------------------------------------------
 product_id                 | d8dee61c2034d6d075997acef1870e9b 
 product_category_name      | NULL                             
 product_name_lenght        | NULL                             
 product_description_lenght | NULL                             
 product_photos_qty         | NULL      

In [17]:
# Show the first 2 rows and the schema of sellers_df
sellers_df.show(2,vertical=True,truncate=False)
sellers_df.printSchema()



-RECORD 0--------------------------------------------------
 seller_id              | 392f7f2c797e4dc077e4311bde2ab8ce 
 seller_zip_code_prefix | 21210                            
 seller_city            | rio de janeiro                   
 seller_state           | RN                               
-RECORD 1--------------------------------------------------
 seller_id              | 99002261c568a84cce14d43fcffb43ea 
 seller_zip_code_prefix | 78095                            
 seller_city            | cuiaba                           
 seller_state           | MT                               
only showing top 2 rows

root
 |-- seller_id: string (nullable = true)
 |-- seller_zip_code_prefix: string (nullable = true)
 |-- seller_city: string (nullable = true)
 |-- seller_state: string (nullable = true)



In [18]:
# Show the first 2 rows and the schema of product_category_name_translation_df
product_category_name_translation_df.show(2,vertical=True,truncate=False)
product_category_name_translation_df.printSchema()


-RECORD 0-----------------------------------------------
 product_category_name         | beleza_saude           
 product_category_name_english | health_beauty          
-RECORD 1-----------------------------------------------
 product_category_name         | informatica_acessorios 
 product_category_name_english | computers_accessories  
only showing top 2 rows

root
 |-- product_category_name: string (nullable = true)
 |-- product_category_name_english: string (nullable = true)



## What should change in the schema?

Based on the data types reported by `printSchema()`, these are the columns to remap. The cast step below also renames the misspelled columns `product_name_lenght` and `product_description_lenght` to `product_name_length` and `product_description_length`.

| DataFrame         | Column                        | Target data type |
|-------------------|-------------------------------|------------------|
| products_df       | product_photos_qty            | Integer          |
| products_df       | product_weight_g              | Integer          |
| products_df       | product_length_cm             | Integer          |
| products_df       | product_height_cm             | Integer          |
| products_df       | product_width_cm              | Integer          |
| orders_df         | order_purchase_timestamp      | Timestamp        |
| orders_df         | order_approved_at             | Timestamp        |
| orders_df         | order_delivered_carrier_date  | Timestamp        |
| orders_df         | order_delivered_customer_date | Timestamp        |
| orders_df         | order_estimated_delivery_date | Timestamp        |
| order_reviews_df  | review_score                  | Integer          |
| order_reviews_df  | review_creation_date          | Timestamp        |
| order_reviews_df  | review_answer_timestamp       | Timestamp        |
| order_payments_df | payment_sequential            | Integer          |
| order_payments_df | payment_installments          | Integer          |
| order_payments_df | payment_value                 | Double           |
| order_items_df    | shipping_limit_date           | Timestamp        |
| order_items_df    | price                         | Double           |
| order_items_df    | freight_value                 | Double           |
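The table above can also be kept as data and applied in a loop, which keeps the documented types and the executed casts in sync. A minimal sketch follows; `CAST_SPEC` and `apply_casts` are hypothetical names, and only two of the DataFrames are spelled out. It uses `DataFrame.withColumn` and `Column.cast`, which accepts a type name such as `"int"`, `"double"`, or `"timestamp"`.

```python
# Hypothetical mapping: DataFrame name -> {column: target Spark type name}.
CAST_SPEC = {
    "order_payments_df": {
        "payment_sequential": "int",
        "payment_installments": "int",
        "payment_value": "double",
    },
    "order_items_df": {
        "shipping_limit_date": "timestamp",
        "price": "double",
        "freight_value": "double",
    },
}


def apply_casts(df, casts):
    """Return a new DataFrame with each listed column cast to its target type."""
    for column, type_name in casts.items():
        df = df.withColumn(column, df[column].cast(type_name))
    return df
```

For example, `order_items_df = apply_casts(order_items_df, CAST_SPEC["order_items_df"])` would apply the three `order_items_df` rows of the table.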

Why not convert the zip code prefix columns to integer?  
These columns hold the first five digits of Brazilian postal codes (CEP), whose full format is xxxxx-xxx. They are identifiers, so an integer type brings no benefit for calculations. In addition, some CEPs start with 0; converting them to integer would drop the leading zero and leave a value with fewer than five digits, which is incorrect.  
If a numeric type is ever needed, add one more step to the pipeline or convert at the output (Power BI).
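The leading-zero problem is easy to reproduce in plain Python. The snippet below is only an illustration; the prefix `"01310"` is a sample value chosen for the demonstration, not a row taken from the dataset.

```python
# A CEP prefix is an identifier, not a quantity: converting it to an integer
# silently drops leading zeros.
cep_prefix = "01310"  # illustrative prefix that starts with zero

as_int = int(cep_prefix)
print(as_int)            # the leading zero is gone
print(len(str(as_int)))  # fewer than 5 characters

# If the value was already stored as a number, zero-padding restores the width.
restored = str(as_int).zfill(5)
print(restored == cep_prefix)
```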


## Changing the schema of the tables listed above

In [34]:
# products_df
products_df = products_df.withColumn("product_weight_g", col("product_weight_g").cast(IntegerType())) \
                         .withColumnRenamed("product_name_lenght", "product_name_length") \
                         .withColumnRenamed("product_description_lenght", "product_description_length") \
                         .withColumn("product_length_cm", col("product_length_cm").cast(IntegerType())) \
                         .withColumn("product_height_cm", col("product_height_cm").cast(IntegerType())) \
                         .withColumn("product_photos_qty", col("product_photos_qty").cast(IntegerType())) \
                         .withColumn("product_width_cm", col("product_width_cm").cast(IntegerType()))

# orders_df
orders_df = orders_df.withColumn("order_purchase_timestamp", col("order_purchase_timestamp").cast(TimestampType())) \
                     .withColumn("order_approved_at", col("order_approved_at").cast(TimestampType())) \
                     .withColumn("order_delivered_carrier_date", col("order_delivered_carrier_date").cast(TimestampType())) \
                     .withColumn("order_delivered_customer_date", col("order_delivered_customer_date").cast(TimestampType())) \
                     .withColumn("order_estimated_delivery_date", col("order_estimated_delivery_date").cast(TimestampType()))

# order_reviews_df
order_reviews_df = order_reviews_df.withColumn("review_score", col("review_score").cast(IntegerType())) \
                                   .withColumn("review_creation_date", col("review_creation_date").cast(TimestampType())) \
                                   .withColumn("review_answer_timestamp", col("review_answer_timestamp").cast(TimestampType()))

# order_payments_df
order_payments_df = order_payments_df.withColumn("payment_sequential", col("payment_sequential").cast(IntegerType())) \
                                     .withColumn("payment_installments", col("payment_installments").cast(IntegerType())) \
                                     .withColumn("payment_value", col("payment_value").cast(DoubleType()))

# order_items_df
order_items_df = order_items_df.withColumn("shipping_limit_date", col("shipping_limit_date").cast(TimestampType())) \
                               .withColumn("price", col("price").cast(DoubleType())) \
                               .withColumn("freight_value", col("freight_value").cast(DoubleType()))



## Creating date columns for month, day, year, and day of the week

In [35]:
from pyspark.sql.functions import year, month, dayofmonth, dayofweek

order_items_df = order_items_df.withColumn("order_year", year(col("shipping_limit_date"))) \
                     .withColumn("order_month", month(col("shipping_limit_date"))) \
                     .withColumn("order_day", dayofmonth(col("shipping_limit_date"))) \
                     .withColumn("order_weekday", dayofweek(col("shipping_limit_date")))

order_reviews_df = order_reviews_df.withColumn("order_review_year", year(col("review_creation_date")))\
                                    .withColumn("order_review_month", month(col("review_creation_date")))\
                                    .withColumn("order_review_day", dayofmonth(col("review_creation_date")))\
                                    .withColumn("review_answer_year", year(col("review_answer_timestamp")))\
                                    .withColumn("review_answer_month", month(col("review_answer_timestamp")))\
                                    .withColumn("review_answer_day", dayofmonth(col("review_answer_timestamp")))

orders_df = orders_df.withColumn("order_purchase_year", year(col("order_purchase_timestamp")))\
                        .withColumn("order_purchase_month", month(col("order_purchase_timestamp")))\
                        .withColumn("order_purchase_day", dayofmonth(col("order_purchase_timestamp")))\
                        .withColumn("order_approved_year", year(col("order_approved_at")))\
                        .withColumn("order_approved_month", month(col("order_approved_at")))\
                        .withColumn("order_approved_day", dayofmonth(col("order_approved_at")))\
                        .withColumn("order_delivered_carrier_year", year(col("order_delivered_carrier_date")))\
                        .withColumn("order_delivered_carrier_month", month(col("order_delivered_carrier_date")))\
                        .withColumn("order_delivered_carrier_day", dayofmonth(col("order_delivered_carrier_date")))\
                        .withColumn("order_delivered_customer_year", year(col("order_delivered_customer_date")))\
                        .withColumn("order_delivered_customer_month", month(col("order_delivered_customer_date")))\
                        .withColumn("order_delivered_customer_day", dayofmonth(col("order_delivered_customer_date")))\
                        .withColumn("order_estimated_delivery_year", year(col("order_estimated_delivery_date")))\
                        .withColumn("order_estimated_delivery_month", month(col("order_estimated_delivery_date")))\
                        .withColumn("order_estimated_delivery_day", dayofmonth(col("order_estimated_delivery_date")))

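The repeated year/month/day pattern above can be generated from the column name. This is a minimal sketch, not the notebook's code: `date_part_exprs` is a hypothetical helper that builds Spark SQL expression strings for `DataFrame.selectExpr`. Note that Spark's `dayofweek` numbers days from 1 (Sunday) to 7 (Saturday), which differs from Python's `isoweekday` (1 = Monday); the hypothetical `spark_dayofweek` function shows the conversion.

```python
from datetime import date


def date_part_exprs(source, prefix, weekday=False):
    """Build SQL expressions that derive year/month/day columns from `source`."""
    exprs = [
        f"year({source}) AS {prefix}_year",
        f"month({source}) AS {prefix}_month",
        f"dayofmonth({source}) AS {prefix}_day",
    ]
    if weekday:
        exprs.append(f"dayofweek({source}) AS {prefix}_weekday")
    return exprs


def spark_dayofweek(d):
    """Mirror Spark's dayofweek numbering (1 = Sunday ... 7 = Saturday)."""
    return d.isoweekday() % 7 + 1


print(date_part_exprs("order_approved_at", "order_approved"))
print(spark_dayofweek(date(2017, 7, 21)))  # date of the sample shipping_limit_date
```

With a live DataFrame, the expressions could be applied as `orders_df.selectExpr("*", *date_part_exprs("order_approved_at", "order_approved"))`.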

## Replacing null values

In [36]:
customers_df = customers_df.fillna({"customer_zip_code_prefix": "0", 
                                    "customer_city": "Unknown",
                                    "customer_state": "Unknown",
                                    "customer_unique_id": "Unknown",
                                    "customer_id": "Unknown"
                                    })

geolocation_df = geolocation_df.fillna(
    {
        "geolocation_zip_code_prefix": "0",
        "geolocation_lat": "0",
        "geolocation_lng": "0",
        "geolocation_city": "Unknown",
        "geolocation_state": "Unknown"
    }
)

order_items_df = order_items_df.fillna({
    "order_id": "Unknown",
    "order_item_id": "0",
    "product_id": "Unknown",
    "seller_id": "Unknown",
    # shipping_limit_date is a timestamp and cannot hold the string "Unknown"; its nulls are kept
    "price": 0.0,
    "freight_value": 0.0
})

order_payments_df = order_payments_df.fillna(
    {
        "order_id": "Unknown",
        "payment_sequential": 0,
        "payment_type": "Unknown",
        "payment_installments": 0,
        "payment_value": 0.0
})

# review_creation_date and review_answer_timestamp are timestamps and cannot
# hold the string "Unknown", so their nulls are kept.
order_reviews_df = order_reviews_df.fillna(
    {
        "review_id": "Unknown",
        "order_id": "Unknown",
        "review_score": 0,
        "review_comment_title": "Unknown",
        "review_comment_message": "Unknown"
})


# The timestamp columns and the integer year/month/day columns derived from
# them cannot hold the string "Unknown", so their nulls are kept (for example,
# a canceled order has no approval or delivery date).
orders_df = orders_df.fillna(
    {
        "order_id": "Unknown",
        "customer_id": "Unknown",
        "order_status": "Unknown"
})

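products_df = products_df.fillna(
{
    "product_id": "Unknown",
    "product_category_name": "Unknown",
    # The two length columns were renamed but not cast, so they are still strings
    "product_name_length": "0",
    "product_description_length": "0",
    # product_photos_qty was cast to integer above, so it gets an integer default
    "product_photos_qty": 0,
    "product_weight_g": 0,
    "product_length_cm": 0,
    "product_height_cm": 0,
    "product_width_cm": 0
})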

sellers_df = sellers_df.fillna(
{
    "seller_id": "Unknown",
    "seller_zip_code_prefix": "0",
    "seller_city": "Unknown",
    "seller_state": "Unknown"
})

product_category_name_translation_df = product_category_name_translation_df.fillna(
 {
     "product_category_name": "Unknown",
     "product_category_name_english": "Unknown"
 })   

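Because a fill value has to be compatible with the column's type, the replacement dictionary can be derived from the schema instead of being typed by hand. The sketch below is an illustration under assumptions: `build_fill_values`, `NUMERIC_TYPES`, and the defaults are hypothetical, and the helper deliberately skips timestamp columns so that missing dates stay NULL.

```python
NUMERIC_TYPES = {"int", "bigint", "float", "double"}


def build_fill_values(dtypes, text_default="Unknown", number_default=0):
    """Map each string or numeric column to a default of a compatible type.

    `dtypes` is the (column name, type name) list from DataFrame.dtypes.
    Columns of any other type (for example timestamp) are left out.
    """
    values = {}
    for name, type_name in dtypes:
        if type_name == "string":
            values[name] = text_default
        elif type_name in NUMERIC_TYPES:
            values[name] = number_default
    return values


# order_items after the casts in this notebook (derived date columns omitted).
order_items_dtypes = [
    ("order_id", "string"),
    ("order_item_id", "string"),
    ("product_id", "string"),
    ("seller_id", "string"),
    ("shipping_limit_date", "timestamp"),
    ("price", "double"),
    ("freight_value", "double"),
]
print(build_fill_values(order_items_dtypes))
```

The result could be passed straight to `order_items_df.fillna(...)`. Columns that need a different placeholder, such as the `"0"` used for zip code prefixes in this notebook, would be overridden by updating the returned dictionary before the call.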

## Saving the transformed tables to the Silver layer

Now write the tables to the silver layer so they can be used later to build the business layers.  

In [38]:

# Save customers_df to the Silver layer
customers_df.write.format("delta")\
                  .mode("overwrite")\
                  .option("path", f"{OUTPUT_PATH}/customers_silver")\
                  .save()

# Save geolocation_df to the Silver layer
geolocation_df.write.format("delta")\
                    .mode("overwrite")\
                    .option("path", f"{OUTPUT_PATH}/geolocation_silver")\
                    .save()

# Save order_items_df to the Silver layer
order_items_df.write.format("delta")\
                    .mode("overwrite")\
                    .option("path", f"{OUTPUT_PATH}/order_items_silver")\
                    .save()

# Save order_payments_df to the Silver layer
order_payments_df.write.format("delta")\
                       .mode("overwrite")\
                       .option("path", f"{OUTPUT_PATH}/order_payments_silver")\
                       .save()

# Save order_reviews_df to the Silver layer
order_reviews_df.write.format("delta")\
                      .mode("overwrite")\
                      .option("path", f"{OUTPUT_PATH}/order_reviews_silver")\
                      .save()

# Save orders_df to the Silver layer
orders_df.write.format("delta")\
               .mode("overwrite")\
               .option("path", f"{OUTPUT_PATH}/orders_silver")\
               .save()

# Save products_df to the Silver layer
products_df.write.format("delta")\
                 .mode("overwrite")\
                 .option("path", f"{OUTPUT_PATH}/products_silver")\
                 .save()

# Save sellers_df to the Silver layer
sellers_df.write.format("delta")\
                .mode("overwrite")\
                .option("path", f"{OUTPUT_PATH}/sellers_silver")\
                .save()

# Save product_category_name_translation_df to the Silver layer
product_category_name_translation_df.write.format("delta")\
                                          .mode("overwrite")\
                                          .option("path", f"{OUTPUT_PATH}/product_category_name_translation_silver")\
                                          .save()


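Since every write above uses the same format, mode, and naming scheme, the nine blocks can be collapsed into one loop. This is a minimal sketch under assumptions: `silver_table_path` and `save_silver_tables` are hypothetical helpers, and the path is passed directly to `save(path)` instead of through `.option("path", ...)`.

```python
def silver_table_path(output_path, name):
    """Return the Delta table location for a silver table."""
    return f"{output_path}/{name}_silver"


def save_silver_tables(frames, output_path):
    """Overwrite one Delta table per entry in `frames` ({name: DataFrame})."""
    for name, df in frames.items():
        df.write.format("delta").mode("overwrite").save(silver_table_path(output_path, name))
```

A call such as `save_silver_tables({"customers": customers_df, "orders": orders_df}, OUTPUT_PATH)` would write `customers_silver` and `orders_silver` under `OUTPUT_PATH`.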