# Getting Brazilian funds' cadastral information

**What?** Brazilian funds are regulated by a government agency called CVM (Comissão de Valores Mobiliários), the Brazilian securities regulator. Every regulated Brazilian fund must therefore register with it, providing information such as name, total assets under management, mandate, administrator, manager, address, and asset classes, among others. This cadastral (registration) information is publicly available on the CVM website.


**Why?** This cadastral information makes it possible to run aggregate analyses of fund sectors and their characteristics. Funds can also be consolidated by management firm to identify management patterns within a group of funds.

**How?** In this notebook, we download the cadastral .csv file from the CVM [webpage](https://dados.cvm.gov.br/dataset/fi-cad), which contains information about every regulated Brazilian fund. The file is updated on the webpage from Tuesday to Saturday and is already in tabular form, so it can be loaded directly into a pandas DataFrame. Finally, the DataFrame is written to a local SQLite database for further analysis.


<img src="https://lh3.googleusercontent.com/d/1KwJrcsJvAZPLej-SsJAlBTr1E6X5c80K" alt="icon_cadastral_info" width="300" align="center">

### Import Libraries

In [1]:
import io
import sqlite3

import pandas as pd
import requests

pd.options.display.float_format = '{:.4f}'.format  # format floats with 4 decimal places when displayed

### Download cadastral information from CVM website

In [2]:
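url = 'https://dados.cvm.gov.br/dados/FI/CAD/DADOS/cad_fi.csv'

# Download the raw bytes and fail early on an HTTP error
response = requests.get(url, timeout=60)
response.raise_for_status()

# Decode explicitly as ISO-8859-1; low_memory=False avoids mixed-dtype warnings on this wide file
df_cad_fi = pd.read_csv(io.BytesIO(response.content), sep=";", encoding="ISO-8859-1", low_memory=False)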




In [3]:
df_cad_fi.head()

Unnamed: 0,TP_FUNDO,CNPJ_FUNDO,DENOM_SOCIAL,DT_REG,DT_CONST,CD_CVM,DT_CANCEL,SIT,DT_INI_SIT,DT_INI_ATIV,...,CPF_CNPJ_GESTOR,GESTOR,CNPJ_AUDITOR,AUDITOR,CNPJ_CUSTODIANTE,CUSTODIANTE,CNPJ_CONTROLADOR,CONTROLADOR,INVEST_CEMPR_EXTER,CLASSE_ANBIMA
0,FACFIF,00.000.684/0001-21,DEUTSCHE BANK FDO APLIC QUOTAS FDO INV FINANCE...,2003-04-30,1994-12-20,19.0,2000-08-01,CANCELADA,2000-08-01,,...,,,,,,,,,,
1,FACFIF,00.000.731/0001-37,ITAMARITI CASH FUNDO APLICACAO QUOTAS FDOS INV...,2003-04-30,1994-05-18,40681.0,1996-01-26,CANCELADA,1996-01-26,,...,,,,,,,,,,
2,FACFIF,00.000.732/0001-81,FUNDO APLIC. QUOTAS DE F.I. SANTANDER CURTO PRAZO,2003-04-30,1994-05-24,27.0,1999-09-03,CANCELADA,1999-09-03,,...,,,,,,,,,,
3,FACFIF,00.000.740/0001-28,FUNDO DE APLIC EM QUOTAS DE FUNDOS DE INV BMC ...,2003-04-30,1994-05-23,40690.0,1996-06-10,CANCELADA,1996-06-10,,...,,,,,,,,,,
4,FACFIF,00.000.749/0001-39,BALANCE FUNDO APLICACAO QUOTAS FUNDO INVESTIME...,2003-04-30,1994-05-12,35.0,2000-06-26,CANCELADA,2000-06-26,,...,,,,,,,,,,


In [4]:
df_cad_fi.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 78337 entries, 0 to 78336
Data columns (total 41 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   TP_FUNDO            78337 non-null  object 
 1   CNPJ_FUNDO          78337 non-null  object 
 2   DENOM_SOCIAL        78337 non-null  object 
 3   DT_REG              78337 non-null  object 
 4   DT_CONST            76238 non-null  object 
 5   CD_CVM              78333 non-null  float64
 6   DT_CANCEL           42205 non-null  object 
 7   SIT                 78337 non-null  object 
 8   DT_INI_SIT          78337 non-null  object 
 9   DT_INI_ATIV         56775 non-null  object 
 10  DT_INI_EXERC        60644 non-null  object 
 11  DT_FIM_EXERC        60644 non-null  object 
 12  CLASSE              65039 non-null  object 
 13  DT_INI_CLASSE       65039 non-null  object 
 14  RENTAB_FUNDO        49201 non-null  object 
 15  CONDOM              64691 non-null  object 
 16  FUND
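
The `info()` output above lists the `DT_*` columns as `object`, meaning the dates are stored as plain strings. Below is a minimal sketch of one way to parse them before analysis. The helper name `parse_date_columns` and the sample rows are hypothetical, and the `YYYY-MM-DD` format is an assumption based on the values shown by `head()`.

```python
import pandas as pd


def parse_date_columns(df: pd.DataFrame, prefix: str = "DT_") -> pd.DataFrame:
    """Return a copy of df where every column starting with `prefix` is parsed as datetime."""
    out = df.copy()
    date_cols = [col for col in out.columns if col.startswith(prefix)]
    for col in date_cols:
        # errors="coerce" turns missing or unparseable values into NaT instead of raising
        out[col] = pd.to_datetime(out[col], format="%Y-%m-%d", errors="coerce")
    return out


# Illustrative rows only (not real CVM data); column names follow the cadastral file
sample = pd.DataFrame({
    "CNPJ_FUNDO": ["00.000.000/0001-00", "11.111.111/0001-11"],
    "DT_REG": ["2003-04-30", "2010-01-15"],
    "DT_CANCEL": ["2000-08-01", None],
})

parsed = parse_date_columns(sample)
print(parsed.dtypes)
```

On the real data this would be called as `df_cad_fi = parse_date_columns(df_cad_fi)`. Note that SQLite has no native date type, so how the parsed dates are stored depends on how pandas serializes them when writing.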

### Write the dataframe into the SQLite database

In [5]:
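# Connect to the local SQLite database (sqlite3 creates the file if it does not exist yet)
conn = sqlite3.connect('D:/finance_data/finance_database.db')

try:
    # Replace the table on every run so it mirrors the latest CVM file
    df_cad_fi.to_sql('CVM_funds_cadastral_info', conn, if_exists='replace', index=False)
    conn.commit()
finally:
    conn.close()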
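
As an example of the further analysis mentioned in the introduction, the sketch below reads the table back and counts non-cancelled funds per manager. It is only an illustration: it uses an in-memory database and made-up rows so that it runs on its own, the manager names are placeholders, and the status value `EM FUNCIONAMENTO NORMAL` is an assumption (only `CANCELADA` appears in the output shown earlier).

```python
import sqlite3

import pandas as pd

# Illustrative rows only (not real CVM data); column names follow the cadastral file
sample = pd.DataFrame({
    "CNPJ_FUNDO": ["A", "B", "C", "D"],
    "SIT": ["EM FUNCIONAMENTO NORMAL", "CANCELADA",
            "EM FUNCIONAMENTO NORMAL", "EM FUNCIONAMENTO NORMAL"],
    "GESTOR": ["MANAGER X", "MANAGER X", "MANAGER Y", "MANAGER X"],
})

# An in-memory database stands in for the local .db file used in the notebook
conn = sqlite3.connect(":memory:")
try:
    sample.to_sql("CVM_funds_cadastral_info", conn, if_exists="replace", index=False)

    # Count funds that are not cancelled, grouped by management firm
    query = """
        SELECT GESTOR, COUNT(*) AS n_funds
        FROM CVM_funds_cadastral_info
        WHERE SIT <> 'CANCELADA' AND GESTOR IS NOT NULL
        GROUP BY GESTOR
        ORDER BY n_funds DESC, GESTOR
    """
    funds_per_manager = pd.read_sql_query(query, conn)
finally:
    conn.close()

print(funds_per_manager)
```

To run the same query against the real data, point `sqlite3.connect` at the database file written above and skip the `to_sql` step.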