# Final Project | Big Data
----
**Development and Evaluation of a Distributed Architecture for a Monthly Account Balance Report**

## Load Functions and variables

Functions for testing data quality (Great Expectations)

In [0]:
%run ./modules/data-quality

Python interpreter will be restarted.
Collecting great-expectations
  Downloading great_expectations-0.18.13-py3-none-any.whl (5.4 MB)

Loading function *create_path* to create folders

In [0]:
%run ./modules/utils

JSON strings used to build the field schemas through StructType (Bronze and Silver)

In [0]:
%run ./modules/json_strings

In [0]:
%run ./modules/json_strings_silver
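
The schema modules are not shown in this notebook, so the following is only a sketch of the kind of JSON string that `StructType.fromJson` accepts after `json.loads`. The column names come from the `accounts` table used later; the types, the nullability flags and the name `example_accounts_schema` are assumptions made for illustration.

```python
import json

# Hypothetical schema string in the layout that StructType.json() emits:
# a "struct" holding a list of fields, each with name, type, nullable and metadata.
example_accounts_schema = """
{
  "type": "struct",
  "fields": [
    {"name": "account_id", "type": "long", "nullable": false, "metadata": {}},
    {"name": "status", "type": "string", "nullable": true, "metadata": {}},
    {"name": "created_at", "type": "timestamp", "nullable": true, "metadata": {}}
  ]
}
"""

# Parse the string into a plain dict and list the declared columns.
schema_dict = json.loads(example_accounts_schema)
field_names = [field["name"] for field in schema_dict["fields"]]
print(field_names)

# In the notebook the parsed dict would then be handed to Spark:
# schema = StructType.fromJson(schema_dict)
```

Keeping the schemas as JSON strings lets the same definition be reused by the Bronze and Silver cells without rebuilding `StructField` lists by hand.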

## Load libraries

In [0]:
import zipfile
import os
import json

from pyspark.sql.types import StringType, StructType, StructField, IntegerType, DecimalType, LongType, DataType, TimestampType, DoubleType
from pyspark.sql.functions import expr, last_day, col, min, max, to_date, current_date, sum, lit, concat, lpad

## Load Paths and create dirs

In [0]:
source_path = '/FileStore/project_report_balance/'

landing_path = source_path + 'landing/'
bronze_path = source_path + 'bronze/'
silver_path = source_path + 'silver/'
gold_path = source_path + 'gold/'

path_list = [source_path, landing_path, bronze_path, silver_path, gold_path]

In [0]:
dbutils.fs.rm(bronze_path, True)
dbutils.fs.rm(silver_path, True)
dbutils.fs.rm(gold_path, True)

Out[13]: True

In [0]:
# create dirs
for path in path_list:
    create_path(path)


 The folder project_report_balance was created.

 The folder landing was created.

 The folder bronze was created.

 The folder silver was created.

 The folder gold was created.
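
The body of `create_path` lives in `./modules/utils` and is not shown here. As a rough illustration of the idea only, the sketch below builds the same layer layout on a local temporary directory with `os.makedirs`; `create_path_local` is a hypothetical stand-in, and the real helper presumably works against DBFS instead.

```python
import os
import tempfile

def create_path_local(path):
    """Hypothetical local stand-in for create_path.

    Creates the folder when it is missing and reports whether it was created.
    """
    if os.path.isdir(path):
        return False
    os.makedirs(path)
    return True

# Build the root folder plus one folder per layer inside a throwaway directory.
root = os.path.join(tempfile.mkdtemp(), "project_report_balance")
layers = [root] + [os.path.join(root, name) for name in ("landing", "bronze", "silver", "gold")]
created = [create_path_local(path) for path in layers]
print(created)
```

Calling the helper a second time on the same path returns `False`, which makes the cell safe to re-run.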


## Landing Zone

In [0]:
dir_path_list = ['accounts', 'city', 'country', 'customers', 'd_month', 'd_time', 'd_week', 'd_weekday', 'd_year', 'pix_movements', 'state', 'transfer_ins', 'transfer_outs']

In [0]:
for dir_path in dir_path_list:
    dbutils.fs.mkdirs(landing_path + dir_path)

## Bronze layer

In [0]:
## Load the landing CSV files and save them to the Bronze layer as parquet with a defined schema

for dir_name, json_str in zip(dir_path_list, json_str_list):
    print(f'Creating dir {dir_name} in the Bronze layer')

    dir_path = landing_path + dir_name
    # Keep only the CSV files found in the landing directory
    csv_file_path = [arquivos.path for arquivos in dbutils.fs.ls(dir_path) if arquivos.name.endswith(('.csv', '.CSV'))]

    print(f'Saving data as parquet in dir {dir_name} with schema defined')

    # Load the JSON schema used to create the table
    schema_json = StructType.fromJson(json.loads(json_str))

    # Only the first CSV file found in the directory is read
    df_csv = spark.read.csv(csv_file_path[0], sep=',', header=True, schema=schema_json)

    path_dir_bronze = bronze_path + dir_name
    (df_csv
        .write
        .option("compression", "snappy")
        .mode("overwrite")
        .parquet(path_dir_bronze))
    print('Data saved! \n')

Creating dir accounts in the Bronze layer
Saving data as parquet in dir accounts with schema defined
Data saved! 

Creating dir city in the Bronze layer
Saving data as parquet in dir city with schema defined
Data saved! 

Creating dir country in the Bronze layer
Saving data as parquet in dir country with schema defined
Data saved! 

Creating dir customers in the Bronze layer
Saving data as parquet in dir customers with schema defined
Data saved! 

Creating dir d_month in the Bronze layer
Saving data as parquet in dir d_month with schema defined
Data saved! 

Creating dir d_time in the Bronze layer
Saving data as parquet in dir d_time with schema defined
Data saved! 

Creating dir d_week in the Bronze layer
Saving data as parquet in dir d_week with schema defined
Data saved! 

Creating dir d_weekday in the Bronze layer
Saving data as parquet in dir d_weekday with schema defined
Data saved! 

Creating dir d_year in the Bronze layer
Saving data as parquet in dir d_year with schema d
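
The Bronze loop picks the first file whose name ends in `.csv` or `.CSV`, which would skip a name such as `data.Csv` and raise an `IndexError` on an empty directory. A possible hardening, shown here as a plain-Python sketch on hypothetical file names (`pick_csv_files` and `first_csv` are illustrative helpers, not part of the notebook modules):

```python
def pick_csv_files(file_names):
    """Return the names that end in .csv, ignoring letter case."""
    return [name for name in file_names if name.lower().endswith(".csv")]

def first_csv(file_names, dir_name):
    """Return the first CSV name in sorted order, or fail with a clear message."""
    csv_files = sorted(pick_csv_files(file_names))
    if not csv_files:
        raise FileNotFoundError(f"No CSV file found in landing dir '{dir_name}'")
    return csv_files[0]

# Hypothetical directory listing for illustration.
listing = ["_SUCCESS", "accounts.Csv", "notes.txt"]
print(first_csv(listing, "accounts"))
```

Sorting before taking the first element makes the choice deterministic when a directory holds more than one CSV file.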

## Silver layer

#### Data Quality (Great Expectations)

###### Type check

In [0]:
tables_type = 'accounts'

# Load the dataframe
table_path = bronze_path + tables_type
df_table = spark.read.parquet(table_path)

# Convert the Spark DataFrame into a Great Expectations dataset
ge_df_table = SparkDFDataset(df_table)

# Check the column types
print(f'Table analyzed: {tables_type} - DATETIME COLUMNS')
colunas_datetime = ['created_at']
verifica_colunas_datetime(ge_df_table, colunas_datetime)

Table analyzed: accounts - DATETIME COLUMNS
Column 'created_at' is valid. (Type and Value)


###### Categorical column check

In [0]:
tables_cat = {
    'accounts': {
        'status': ["active", 'inactive']
    }, 
    'pix_movements': {
        'status': ["failed", 'completed'],
        'in_or_out': ["pix_in", 'pix_out']
    }
}

In [0]:
for table, cols in tables_cat.items():
    print(f'>>> Table analyzed: {table}')
    # Load the dataframe
    table_path = bronze_path + table
    df_table = spark.read.parquet(table_path)

    # Convert the Spark DataFrame into a Great Expectations dataset
    ge_df_table = SparkDFDataset(df_table)

    for col_analisada, valores_esperados in cols.items():
        verificar_colunas_categoricas(ge_df_table, col_analisada, valores_esperados)

>>> Table analyzed: accounts
The values in column status contain only expected values. (['active', 'inactive'])
>>> Table analyzed: pix_movements
The values in column status contain only expected values. (['failed', 'completed'])
The values in column in_or_out contain only expected values. (['pix_in', 'pix_out'])
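
The implementation of `verificar_colunas_categoricas` is in `./modules/data-quality` and is not reproduced here. The rule it reports on can be illustrated in plain Python: a column passes when every value belongs to the expected set. `unexpected_values` and the sample data below are hypothetical.

```python
def unexpected_values(values, expected):
    """Return the distinct values that are not in the expected set, sorted."""
    expected_set = set(expected)
    return sorted({value for value in values if value not in expected_set})

# Hypothetical column contents for illustration.
status_column = ["active", "inactive", "active", "blocked"]
print(unexpected_values(status_column, ["active", "inactive"]))
```

An empty result means the column contains only expected categories; anything else lists the offending values so they can be inspected.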


###### ID column check

In [0]:
tables_id = {
    'accounts': ['account_id', 'customer_id']
    , 'city': ["state_id","city_id"]
    , 'country': ["country_id"]
    , 'customers': ["customer_id"] 
    , 'pix_movements': ["id",'account_id']
}

In [0]:
for table, id_cols_list in tables_id.items():
    print(f'>>> Table analyzed: {table}')
    # Load the dataframe
    table_path = bronze_path + table
    df_table = spark.read.parquet(table_path)

    # Convert the Spark DataFrame into a Great Expectations dataset
    ge_df_table = SparkDFDataset(df_table)

    verificar_colunas_id(ge_df_table, id_cols_list)

>>> Table analyzed: accounts
Column 'account_id' is valid. (Type and Value)
Column 'customer_id' is valid. (Type and Value)
>>> Table analyzed: city
Column 'state_id' is valid. (Type and Value)
Column 'city_id' is valid. (Type and Value)
>>> Table analyzed: country
Column 'country_id' is valid. (Type and Value)
>>> Table analyzed: customers
Column 'customer_id' is valid. (Type and Value)
>>> Table analyzed: pix_movements
Column 'id' is valid. (Type and Value)
Column 'account_id' is valid. (Type and Value)


###### Non-empty column check

In [0]:
tables_non_empty  = {
    'accounts': ['account_id', 'customer_id', 'account_branch', 'account_check_digit', 'account_number']
    , 'city': ['city']
    , 'country': ['country']
    , 'customers': ['first_name', 'last_name', 'country_name', 'customer_city','cpf']
    , 'pix_movements': ['account_id', 'id', 'pix_amount', 'pix_requested_at','pix_completed_at']
}

In [0]:
for table, cols_no_empty in tables_non_empty.items():
    print(f'>>> Table analyzed: {table}')
    # Load the dataframe
    table_path = bronze_path + table
    df_table = spark.read.parquet(table_path)

    # Convert the Spark DataFrame into a Great Expectations dataset
    ge_df_table = SparkDFDataset(df_table)

    verificar_colunas_com_none(ge_df_table, cols_no_empty)

>>> Table analyzed: accounts
Column 'account_id' is valid. (No null values)
Column 'customer_id' is valid. (No null values)
Column 'account_branch' is valid. (No null values)
Column 'account_check_digit' is valid. (No null values)
Column 'account_number' is valid. (No null values)
>>> Table analyzed: city
Column 'city' is valid. (No null values)
>>> Table analyzed: country
Column 'country' is valid. (No null values)
>>> Table analyzed: customers
Column 'first_name' is valid. (No null values)
Column 'last_name' is valid. (No null values)
Column 'country_name' is valid. (No null values)
Column 'customer_city' is valid. (No null values)
Column 'cpf' is valid. (No null values)
>>> Table analyzed: pix_movements
Column 'account_id' is valid. (No null values)
Column 'id' is valid. (No null values)
Column 'pix_amount' is valid. (No null values)
A colu

###### MIN and MAX value check

In [0]:
tables_min_max  = {
    'customers': [['cpf'],  0, 99999999999]
    , 'pix_movements': [['pix_amount'],  0, 10000]
}

In [0]:
for table, values in tables_min_max.items():
    print(f'>>> Table analyzed: {table}')
    # Load the dataframe
    table_path = bronze_path + table
    df_table = spark.read.parquet(table_path)

    # Convert the Spark DataFrame into a Great Expectations dataset
    ge_df_table = SparkDFDataset(df_table)

    # Each entry holds [list of columns, minimum, maximum]
    list_of_columns, minimo, maximo = values[0], values[1], values[2]
    verificar_valores_min_max(ge_df_table, list_of_columns, minimo, maximo)


>>> Table analyzed: customers
Column 'cpf' is between the values 0 and 99999999999.
>>> Table analyzed: pix_movements
Column 'pix_amount' is between the values 0 and 10000.
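
The range rule configured in `tables_min_max` can be pictured with a small plain-Python sketch. This is not the code behind `verificar_valores_min_max`, which is not shown; `out_of_range` and the sample values are hypothetical, and treating the bounds as inclusive and skipping nulls are assumptions.

```python
def out_of_range(values, minimum, maximum):
    """Return the non-null values that fall outside [minimum, maximum]."""
    return [value for value in values
            if value is not None and not (minimum <= value <= maximum)]

# Hypothetical pix_amount values checked against the 0..10000 bounds from the notebook.
pix_amounts = [12.5, 0, 10000, 10000.01, None, -3]
print(out_of_range(pix_amounts, 0, 10000))
```

Null handling is left to the separate non-empty check, so this helper only reports values that are present and out of bounds.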


#### Creating Silver layer

In [0]:
%sql
CREATE SCHEMA IF NOT EXISTS silver LOCATION '/FileStore/project_report_balance/silver'

#### Creating Silver tables with StructType object from JSON file

In [0]:
lista_df_silver = {}

In [0]:
for dir_name, json_str in zip(dir_path_list, json_str_list):
    print('Creating dataframe: ' + dir_name)

    path_dir_bronze = bronze_path + dir_name

    # Load the JSON schema used to create the table
    schema_json = StructType.fromJson(json.loads(json_str))

    # CSV options such as sep and header do not apply to parquet; only the schema is set on the reader
    df_parquet = spark.read.schema(schema_json).parquet(path_dir_bronze)
    lista_df_silver[dir_name] = df_parquet

    # Expose the dataframe as a temporary view named after the source directory
    df_parquet.createOrReplaceTempView(dir_name)


Creating dataframe: accounts
Creating dataframe: city
Creating dataframe: country
Creating dataframe: customers
Creating dataframe: d_month
Creating dataframe: d_time
Creating dataframe: d_week
Creating dataframe: d_weekday
Creating dataframe: d_year
Creating dataframe: pix_movements
Creating dataframe: state
Creating dataframe: transfer_ins
Creating dataframe: transfer_outs


#### Creating Table silver.d_accounts

In [0]:
df_accounts = spark.read.table('accounts')
df_accounts = df_accounts.withColumn('account', concat(col('account_branch'),col('account_check_digit'), col('account_number')))
df_accounts = df_accounts.select(['account_id', 'status','account','created_at'])


schema_json = StructType.fromJson(json.loads(json_str_list_silver[0]))

path_dir_silver_d_account = silver_path + 'd_account'

(df_accounts
    .write
    .saveAsTable('silver.d_accounts', compression = "snappy", mode = "overwrite", path = path_dir_silver_d_account, schema = schema_json)
)
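
The `account` column is the concatenation of branch, check digit and account number, in that order. Spark's `concat` returns null when any input is null, which the sketch below mimics in plain Python. `build_account` is a hypothetical helper, and the way the sample account is split into three parts is invented for illustration.

```python
def build_account(branch, check_digit, number):
    """Concatenate branch, check digit and number; None if any part is missing."""
    parts = (branch, check_digit, number)
    if any(part is None for part in parts):
        return None
    return "".join(str(part) for part in parts)

# Hypothetical split of an account value that appears in the gold output.
print(build_account("7288", "9", "45713"))
print(build_account("7288", None, "45713"))
```

Because a single missing part nulls the whole key, the non-empty check on `account_branch`, `account_check_digit` and `account_number` matters before this step.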

#### Creating Table silver.f_movements


In [0]:
df_d_time = spark.read.table('d_time').select(['time_id', 'action_timestamp'])
df_d_time = df_d_time.withColumn('ultimo_dia_mes', last_day('action_timestamp'))
df_d_time = df_d_time.select(['time_id', 'ultimo_dia_mes'])    

# pix
df_pix = spark.read.table('pix_movements').select(['account_id','pix_amount','in_or_out','status','pix_requested_at', 'pix_completed_at'])
df_pix = df_pix.join(df_d_time.alias('d_time').withColumnRenamed('ultimo_dia_mes','requested_at')
            , col('d_time.time_id') == col('pix_movements.pix_requested_at')
            , 'inner'
    ).join(df_d_time.alias('d_time_completed').withColumnRenamed('ultimo_dia_mes','completed_at')
            , col('d_time_completed.time_id') == col('pix_movements.pix_completed_at')
            , 'left'
    )
df_pix = df_pix.withColumnRenamed('pix_amount','amount')
df_pix = df_pix.select(['account_id','amount', 'in_or_out', 'status','requested_at','completed_at'])

#transfer in
df_transfer_ins = spark.read.table('transfer_ins').select(['account_id','amount','status','transaction_requested_at', 'transaction_completed_at'])
df_transfer_ins = df_transfer_ins.withColumn('in_or_out', lit('transfer_in'))
df_transfer_ins = df_transfer_ins.join(df_d_time.alias('d_time').withColumnRenamed('ultimo_dia_mes','requested_at')
            , col('d_time.time_id') == col('transfer_ins.transaction_requested_at')
            , 'inner'
    ).join(df_d_time.alias('d_time_completed').withColumnRenamed('ultimo_dia_mes','completed_at')
            , col('d_time_completed.time_id') == col('transfer_ins.transaction_completed_at')
            , 'left'
    )
df_transfer_ins = df_transfer_ins.select(['account_id','amount', 'in_or_out', 'status','requested_at','completed_at'])

# transfer outs
df_transfer_outs = spark.read.table('transfer_outs').select(['account_id','amount','status','transaction_requested_at', 'transaction_completed_at'])
df_transfer_outs = df_transfer_outs.withColumn('in_or_out', lit('transfer_out'))
df_transfer_outs = df_transfer_outs.join(df_d_time.alias('d_time').withColumnRenamed('ultimo_dia_mes','requested_at')
            , col('d_time.time_id') == col('transfer_outs.transaction_requested_at')
            , 'inner'
    ).join(df_d_time.alias('d_time_completed').withColumnRenamed('ultimo_dia_mes','completed_at')
            , col('d_time_completed.time_id') == col('transfer_outs.transaction_completed_at')
            , 'left'
    )

df_transfer_outs = df_transfer_outs.select(['account_id','amount', 'in_or_out', 'status','requested_at','completed_at'])

df_f_movements = df_pix.unionAll(df_transfer_ins).unionAll(df_transfer_outs)

schema_json_f_movements = StructType.fromJson(json.loads(json_str_list_silver[1]))
path_dir_silver_f_movements= silver_path + 'f_movements'

(df_f_movements
    .write
    .saveAsTable('silver.f_movements', compression = "snappy", mode = "overwrite", path = path_dir_silver_f_movements, schema = schema_json_f_movements)
)
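
Every movement is keyed by the last day of the month of its request and completion timestamps, which is what `last_day` computes on `d_time.action_timestamp`. A plain-Python equivalent, using hypothetical timestamps, may make the bucketing easier to see:

```python
import calendar
from datetime import date, datetime

def last_day_of_month(ts):
    """Return the last calendar day of the month that contains ts."""
    days_in_month = calendar.monthrange(ts.year, ts.month)[1]
    return date(ts.year, ts.month, days_in_month)

# Hypothetical timestamps: both fall in February 2020, a leap-year month.
requested = datetime(2020, 2, 10, 14, 30)
completed = datetime(2020, 2, 28, 23, 59)
print(last_day_of_month(requested), last_day_of_month(completed))
```

Two movements in the same month map to the same key, so later aggregations can group by that date directly.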

#### Creating Table silver.d_calendar

In [0]:
from pyspark.sql.types import DateType

d_year_df = spark.read.table("d_year").drop(col("year_id")).dropDuplicates(['action_year'])
d_month_df = spark.read.table("d_month").drop(col("month_id")).dropDuplicates(['action_month'])
# Every year is paired with every month, so the cross join is made explicit
d_calendar_df = d_year_df.crossJoin(d_month_df)
d_calendar_df = d_calendar_df.withColumnRenamed("action_year","year").withColumnRenamed("action_month","month")

# Build the first day of each month as a date and take the last day of that month
d_calendar_df = d_calendar_df.withColumn("ultimo_dia_mes", last_day(to_date(concat(d_calendar_df["year"], lit("-"), lpad(d_calendar_df["month"],2,"0"), lit("-"), lit("01")), "yyyy-MM-dd"))) 

d_calendar_df = d_calendar_df.alias("d_calendar")
# ultimo_dia_mes is a date column, hence DateType (DataType is only the abstract base class)
schema_df_d_calendar = StructType([
    StructField("year", IntegerType(), False)
    , StructField("month", IntegerType(), False)
    , StructField("ultimo_dia_mes", DateType(), False)
])

d_calendar_path = silver_path + 'd_calendar'

(d_calendar_df
    .write
    .saveAsTable('silver.d_calendar', compression="snappy", mode="overwrite", path=d_calendar_path, schema=schema_df_d_calendar)
)

In [0]:
%sql
SELECT * FROM silver.d_calendar LIMIT 5

year,month,ultimo_dia_mes
2020,12,2020-12-31
2020,1,2020-01-31
2020,6,2020-06-30
2020,3,2020-03-31
2020,5,2020-05-31
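
The calendar dimension is the cross product of the distinct years and months, with the month end attached to each pair. The same construction in plain Python, as a sketch that assumes a single year (2020, the only one visible in the output above):

```python
import calendar
from datetime import date
from itertools import product

years = [2020]          # assumption: only the year shown in the sample output
months = range(1, 13)

# One row per (year, month) pair, plus the last day of that month.
calendar_rows = [
    (year, month, date(year, month, calendar.monthrange(year, month)[1]))
    for year, month in product(years, months)
]

for row in calendar_rows[:3]:
    print(row)
```

With one row per month regardless of activity, months without movements still appear in the final report.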


## Gold layer

### Table/View 1: Entries per month

In [0]:
# Table/View 1 - Entries per month (lancamentos_por_mes)
# This query creates a table/view with the sum of inflows and outflows per month and account.
sql_lancamentos_por_mes = """
   SELECT
      account_id
      , ultimo_dia_mes
      , SUM(valor_saida) AS VALOR_SAIDA
      , SUM(valor_entrada) AS VALOR_ENTRADA
   FROM (
      -- Outflows are booked in the month in which they were requested
      SELECT account_id
         , requested_at ultimo_dia_mes
         , SUM(amount) valor_saida
         , 0 valor_entrada
      FROM silver.f_movements
      WHERE status = 'completed'
         and in_or_out in ('pix_out', 'transfer_out')
      GROUP BY account_id
         , requested_at
      UNION ALL
      -- Inflows are booked in the month in which they were completed
      SELECT account_id
         , completed_at ultimo_dia_mes
         , 0 valor_saida
         , SUM(amount) valor_entrada
      FROM silver.f_movements
      WHERE status = 'completed'
         and in_or_out in ('pix_in', 'transfer_in')
         and completed_at is not NULL
      GROUP BY account_id
         , completed_at
   ) lancamentos_por_mes
   GROUP BY account_id, ultimo_dia_mes
"""
 
df_lancamentos_por_mes = spark.sql(sql_lancamentos_por_mes)
df_lancamentos_por_mes.createOrReplaceTempView("lancamentos_por_mes")
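
To make the aggregation rule concrete, the sketch below reproduces it in plain Python on a few invented movements: only `completed` rows count, outflows are keyed by `requested_at`, and inflows by `completed_at`. `monthly_entries` and the sample rows are hypothetical and only mirror the SQL above.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal

OUT_TYPES = {"pix_out", "transfer_out"}
IN_TYPES = {"pix_in", "transfer_in"}

def monthly_entries(movements):
    """Sum completed inflows and outflows per (account_id, month end)."""
    totals = defaultdict(lambda: {"valor_saida": Decimal("0"), "valor_entrada": Decimal("0")})
    for m in movements:
        if m["status"] != "completed":
            continue
        if m["in_or_out"] in OUT_TYPES:
            totals[(m["account_id"], m["requested_at"])]["valor_saida"] += m["amount"]
        elif m["in_or_out"] in IN_TYPES and m["completed_at"] is not None:
            totals[(m["account_id"], m["completed_at"])]["valor_entrada"] += m["amount"]
    return dict(totals)

jan, feb = date(2020, 1, 31), date(2020, 2, 29)
sample = [  # invented rows for illustration
    {"account_id": 1, "in_or_out": "pix_out", "status": "completed",
     "requested_at": jan, "completed_at": jan, "amount": Decimal("10.00")},
    {"account_id": 1, "in_or_out": "transfer_in", "status": "completed",
     "requested_at": jan, "completed_at": feb, "amount": Decimal("25.50")},
    {"account_id": 1, "in_or_out": "pix_in", "status": "failed",
     "requested_at": jan, "completed_at": None, "amount": Decimal("99.00")},
]
result = monthly_entries(sample)
print(result)
```

The transfer requested in January but completed in February lands in February, which is the asymmetry the two SQL branches encode.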


### Table/View 2: Total per month

In [0]:
# Table/View 2 - Total per month (total_por_mes)
# This query creates a table/view with the sum of all inflows and outflows up to the month under analysis.
sql_total_por_mes = """
   SELECT
      d_calendar.ultimo_dia_mes As ultimo_dia_mes
      , lancamentos_por_mes.account_id
      , SUM(lancamentos_por_mes.VALOR_ENTRADA) TOTAL_ENTRADA
      , SUM(lancamentos_por_mes.VALOR_SAIDA) TOTAL_SAIDA
   FROM silver.d_calendar
      -- The <= condition pairs each calendar month with every earlier entry, giving running totals
      LEFT JOIN lancamentos_por_mes
         ON lancamentos_por_mes.ultimo_dia_mes <= d_calendar.ultimo_dia_mes
   GROUP BY d_calendar.ultimo_dia_mes
      , lancamentos_por_mes.account_id
"""

df_total_por_mes = spark.sql(sql_total_por_mes)
df_total_por_mes.createOrReplaceTempView('total_por_mes')
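
The join on `lancamentos_por_mes.ultimo_dia_mes <= d_calendar.ultimo_dia_mes` turns monthly figures into running totals. For a single account the effect can be sketched in plain Python; `running_totals` and the figures are invented for illustration.

```python
from datetime import date
from decimal import Decimal

def running_totals(month_ends, monthly):
    """For each month end, sum all (inflow, outflow) pairs dated on or before it."""
    zero = Decimal("0")
    totals = {}
    for month_end in sorted(month_ends):
        total_in = sum((v_in for m, (v_in, _) in monthly.items() if m <= month_end), zero)
        total_out = sum((v_out for m, (_, v_out) in monthly.items() if m <= month_end), zero)
        totals[month_end] = (total_in, total_out)
    return totals

months = [date(2020, 1, 31), date(2020, 2, 29), date(2020, 3, 31)]
entries = {  # invented monthly (inflow, outflow) pairs; February has no movement
    date(2020, 1, 31): (Decimal("100.00"), Decimal("0")),
    date(2020, 3, 31): (Decimal("50.00"), Decimal("30.00")),
}
totals = running_totals(months, entries)
print(totals)
```

February carries the January totals forward even though it has no entries of its own, which is why idle months still show a balance in the final table.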

### Table/View 3: Accumulated per month

In [0]:
# Table/View 3 - Accumulated per month (acumulado_por_mes)
# This table joins the two previous views and computes the final balance per month and account.
sql_acumulado_por_mes = """
    SELECT
        d_accounts.account_id
        , d_calendar.ultimo_dia_mes As ultimo_dia_mes
        , d_calendar.month mes
        , d_calendar.year ano
        , lancamentos_por_mes.VALOR_ENTRADA TOTAL_ENTRADA
        , lancamentos_por_mes.VALOR_SAIDA TOTAL_SAIDA
        , COALESCE(total_por_mes.TOTAL_ENTRADA,0) - COALESCE(total_por_mes.TOTAL_SAIDA,0) AS SALDO_FINAL
    FROM silver.d_accounts
    -- Only months that end on or after the account creation date are reported
    INNER JOIN silver.d_calendar
        ON d_calendar.ultimo_dia_mes >= TO_DATE(d_accounts.created_at, 'yyyy-MM-dd')
    LEFT JOIN lancamentos_por_mes
        ON lancamentos_por_mes.account_id = d_accounts.account_id
        AND lancamentos_por_mes.ultimo_dia_mes = d_calendar.ultimo_dia_mes
    LEFT JOIN total_por_mes
        ON total_por_mes.account_id = d_accounts.account_id
        AND total_por_mes.ultimo_dia_mes = d_calendar.ultimo_dia_mes
"""
df_acumulado_por_mes = spark.sql(sql_acumulado_por_mes)
df_acumulado_por_mes.createOrReplaceTempView("acumulado_por_mes")

### Final Table/View: Monthly balance

In [0]:
# Final table - Monthly balance - agg_saldo_mensal
# This select presents the final monthly balance of each account and the inflows and outflows of each month.
# ROUND keeps the values numeric, in line with the DoubleType columns declared for the gold table
# (FORMAT_NUMBER would return strings with thousands separators).
sql_saldo_mensal = """
   SELECT
       d_accounts.account_id
       , d_accounts.account
       , acumulado_por_mes.ultimo_dia_mes
       , acumulado_por_mes.mes
       , acumulado_por_mes.ano
       , ROUND(coalesce(acumulado_por_mes.total_entrada,0), 2) total_entrada
       , ROUND(coalesce(acumulado_por_mes.total_saida,0), 2) total_saida
       , ROUND(coalesce(acumulado_por_mes.saldo_final,0), 2) saldo_final
   FROM silver.d_accounts
      LEFT JOIN acumulado_por_mes
         ON acumulado_por_mes.account_id = d_accounts.account_id
"""
df_agg_saldo_mensal = spark.sql(sql_saldo_mensal)
df_agg_saldo_mensal = df_agg_saldo_mensal.alias('saldo_mensal')



#### Saving the dataframe as a table

In [0]:
%sql
CREATE SCHEMA IF NOT EXISTS gold LOCATION '/FileStore/project_report_balance/gold'

In [0]:
from pyspark.sql.types import DateType

# Expected layout of gold.agg_saldo_mensal, with names and order matching the columns of the final SELECT
schema_gold_agg_saldo_mensal = StructType([
    StructField("account_id", LongType(), False),
    StructField("account", StringType(), False),
    StructField("ultimo_dia_mes", DateType(), False),
    StructField("mes", IntegerType(), False),
    StructField("ano", IntegerType(), False),
    StructField("total_entrada", DoubleType(), False),
    StructField("total_saida", DoubleType(), False),
    StructField("saldo_final", DoubleType(), False)
])


agg_saldo_mensal_path = gold_path + 'agg_saldo_mensal'
(df_agg_saldo_mensal
    .write
    .saveAsTable('gold.agg_saldo_mensal', compression="snappy", mode="overwrite", path=agg_saldo_mensal_path, schema=schema_gold_agg_saldo_mensal)
)

In [0]:
%sql 
SELECT * FROM gold.agg_saldo_mensal LIMIT 100

account_id,account,ultimo_dia_mes,mes,ano,total_entrada,total_saida,saldo_final
3138902864818696704,7288945713,2020-02-29,2,2020,1258.88,0.0,1258.88
3138902864818696704,7288945713,2020-11-30,11,2020,0.0,0.0,1606.62
3138902864818696704,7288945713,2020-10-31,10,2020,347.74,0.0,1606.62
3138902864818696704,7288945713,2020-07-31,7,2020,0.0,0.0,1258.88
3138902864818696704,7288945713,2020-08-31,8,2020,0.0,0.0,1258.88
3138902864818696704,7288945713,2020-04-30,4,2020,0.0,0.0,1258.88
3138902864818696704,7288945713,2020-09-30,9,2020,0.0,0.0,1258.88
3138902864818696704,7288945713,2020-05-31,5,2020,0.0,0.0,1258.88
3138902864818696704,7288945713,2020-03-31,3,2020,0.0,0.0,1258.88
3138902864818696704,7288945713,2020-06-30,6,2020,0.0,0.0,1258.88


### Examples of Accounts and Monthly Balance

#### Account with at least one month with zero inflow

In [0]:
%sql 
SELECT * FROM gold.agg_saldo_mensal where account_id = 1910868644230470 order by account_id, ultimo_dia_mes

account_id,account,ultimo_dia_mes,mes,ano,total_entrada,total_saida,saldo_final
1910868644230470,5355495934,2020-01-31,1,2020,6149.17,0.0,6149.17
1910868644230470,5355495934,2020-02-29,2,2020,5892.9,1271.45,10770.62
1910868644230470,5355495934,2020-03-31,3,2020,3617.12,3023.6,11364.14
1910868644230470,5355495934,2020-04-30,4,2020,8191.25,1869.75,17685.64
1910868644230470,5355495934,2020-05-31,5,2020,3055.01,5386.48,15354.17
1910868644230470,5355495934,2020-06-30,6,2020,5681.79,2474.06,18561.9
1910868644230470,5355495934,2020-07-31,7,2020,1595.52,1279.43,18877.99
1910868644230470,5355495934,2020-08-31,8,2020,8438.21,7065.54,20250.66
1910868644230470,5355495934,2020-09-30,9,2020,0.0,2353.51,17897.15
1910868644230470,5355495934,2020-10-31,10,2020,1787.24,2125.77,17558.62


#### Account with a month with zero inflow and outflow, i.e. no movement in the month

In [0]:
%sql 
SELECT * FROM gold.agg_saldo_mensal where account_id = 100642855136823056 order by account_id, ultimo_dia_mes

account_id,account,ultimo_dia_mes,mes,ano,total_entrada,total_saida,saldo_final
100642855136823056,5850552478,2020-01-31,1,2020,0.0,0.0,0.0
100642855136823056,5850552478,2020-02-29,2,2020,1328.13,1694.06,-365.93
100642855136823056,5850552478,2020-03-31,3,2020,0.0,0.0,-365.93
100642855136823056,5850552478,2020-04-30,4,2020,0.0,0.0,-365.93
100642855136823056,5850552478,2020-05-31,5,2020,858.33,1290.73,-798.33
100642855136823056,5850552478,2020-06-30,6,2020,0.0,0.0,-798.33
100642855136823056,5850552478,2020-07-31,7,2020,0.0,0.0,-798.33
100642855136823056,5850552478,2020-08-31,8,2020,3247.11,290.35,2158.43
100642855136823056,5850552478,2020-09-30,9,2020,1937.54,0.0,4095.97
100642855136823056,5850552478,2020-10-31,10,2020,0.0,29.37,4066.6


#### Account with a month with negative balance

In [0]:
%sql 
SELECT * FROM gold.agg_saldo_mensal where account_id = 1972174676324008704 order by account_id, ultimo_dia_mes

account_id,account,ultimo_dia_mes,mes,ano,total_entrada,total_saida,saldo_final
1972174676324008704,5231336687,2020-01-31,1,2020,1029.0,0.0,1029.0
1972174676324008704,5231336687,2020-02-29,2,2020,284.45,1408.21,-94.76
1972174676324008704,5231336687,2020-03-31,3,2020,2915.78,0.0,2821.02
1972174676324008704,5231336687,2020-04-30,4,2020,1527.8,1274.08,3074.74
1972174676324008704,5231336687,2020-05-31,5,2020,4711.01,517.33,7268.42
1972174676324008704,5231336687,2020-06-30,6,2020,2934.85,602.66,9600.61
1972174676324008704,5231336687,2020-07-31,7,2020,0.0,0.0,9600.61
1972174676324008704,5231336687,2020-08-31,8,2020,4273.19,0.0,13873.8
1972174676324008704,5231336687,2020-09-30,9,2020,3437.29,0.0,17311.09
1972174676324008704,5231336687,2020-10-31,10,2020,753.67,2282.09,15782.67
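
As a sanity check on the report, the balance of a month should equal the previous balance plus that month's inflow minus its outflow. The sketch below replays the first four rows of the account above, assuming the balance was zero before January 2020; `Decimal` is used so the comparison is exact.

```python
from decimal import Decimal

# (ultimo_dia_mes, total_entrada, total_saida, saldo_final) copied from the output above.
rows = [
    ("2020-01-31", "1029.0", "0.0", "1029.0"),
    ("2020-02-29", "284.45", "1408.21", "-94.76"),
    ("2020-03-31", "2915.78", "0.0", "2821.02"),
    ("2020-04-30", "1527.8", "1274.08", "3074.74"),
]

balance = Decimal("0")  # assumption: no balance before the first listed month
mismatches = []
for month_end, total_in, total_out, reported in rows:
    balance += Decimal(total_in) - Decimal(total_out)
    if balance != Decimal(reported):
        mismatches.append((month_end, balance, Decimal(reported)))

print(mismatches)
```

An empty list means the reported balances are consistent with the monthly inflows and outflows for these rows.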
