# **EXPLORER FOR SEC FILINGS**
<hr>

## Inter IIT Tech Meet 10.0 (2022)

![image](https://www.sec.gov/edgar/search/images/edgar-logo-2x.png)
![image](https://interiit-tech.org/static/media/logo_1.f4d40e83.png)

In this notebook, we use the [EDGAR](https://www.sec.gov/edgar/searchedgar/) API to explore the SEC filings of a company, accessed through the Python library `python-edgar`. Be careful: the SEC limits automated access to 10 requests per second. If a black SUV shows up out in the open, it's probably because you're doing something wrong.
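
One way to stay under that limit when writing your own request loop is to space calls out. This is a minimal sketch, not part of `python-edgar`: `RateLimiter` is a hypothetical helper, and the loop only prints instead of contacting EDGAR.

```python
import time


class RateLimiter:
    """Allow at most `max_calls` calls per `period` seconds (hypothetical helper)."""

    def __init__(self, max_calls=10, period=1.0):
        # Minimum spacing between two consecutive calls
        self.min_interval = period / max_calls
        self._last_call = None

    def wait(self):
        """Sleep just long enough to keep calls at least `min_interval` apart."""
        now = time.monotonic()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()


limiter = RateLimiter(max_calls=10, period=1.0)
for n in range(3):
    limiter.wait()  # call this before each request to EDGAR
    print("request slot", n)
```

Spacing calls evenly is stricter than the limit requires (it never allows a burst), which keeps the logic simple.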

## Objectives
- Scrape data from the company's history since inception
- Use 10-Q, 10-K, and 8-K filings to get the company's financial statements
- Use Financial Statements to get the company's balance sheet, income statement, cash flow statement, and ratios
- Use the data to get the company's current assets, liabilities, and equity
- Generate SaaS Metrics
- Generate a Financial Statement Analysis
- Use the metrics with Deep Learning Systems to give Insightful Results
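
As a rough illustration of the ratio step, two common balance-sheet ratios can be computed as below. The function name `financial_ratios` and all figures are hypothetical; the numbers are not taken from any real filing.

```python
def financial_ratios(current_assets, current_liabilities, total_liabilities, total_equity):
    """Return two common balance-sheet ratios as a dict."""
    return {
        # Short-term liquidity: current assets per unit of current liabilities
        "current_ratio": current_assets / current_liabilities,
        # Leverage: total liabilities per unit of shareholders' equity
        "debt_to_equity": total_liabilities / total_equity,
    }


# Hypothetical figures (in millions), for illustration only
ratios = financial_ratios(
    current_assets=500.0,
    current_liabilities=250.0,
    total_liabilities=400.0,
    total_equity=800.0,
)
print(ratios)
```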

Gracias<br>
Kaushik Dey


## Setup and Importing Modules

We will need:
- `edgar` (python-edgar) for the API
- Pandas for data manipulation
- NumPy for integer type conversion
- Matplotlib for plotting
- OS for file system paths

In [8]:
import edgar
import pandas as pd
import matplotlib.pyplot as plt
import os
import numpy as np

## Importing SaaS Companies
### (Optional if you already have the output files)

- Get the names of the required companies from the database
- Find CIK Number for each company
- Put it in a table

The current implementation takes anywhere between 12 and 20 minutes to run, because it scans the whole CIK table once per company. Feel free to optimize the algorithm if you have time.
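
One possible optimization is to build a name-to-CIK dictionary once and look each company up in constant time. The sketch below uses plain lists with hypothetical sample rows; `match_ciks` is an illustrative name, not something defined elsewhere in this notebook.

```python
def match_ciks(companies, cik_table):
    """Return {company: CIK or None} using a single pass over each list."""
    # Build the name -> CIK index once. setdefault keeps the first CIK seen
    # for a name, which mirrors stopping at the first match in a linear scan.
    index = {}
    for name, cik in cik_table:
        index.setdefault(name, cik)
    # Exact string match, same as the nested-loop comparison
    return {company: index.get(company) for company in companies}


# Hypothetical sample rows standing in for the (NAME, CIK) table
sample_cik_table = [
    ("EXAMPLE SOFTWARE INC", 1111111),
    ("DEMO CLOUD CORP", 2222222),
    ("EXAMPLE SOFTWARE INC", 3333333),
]
sample_companies = ["DEMO CLOUD CORP", "EXAMPLE SOFTWARE INC", "MISSING CO"]

matches = match_ciks(sample_companies, sample_cik_table)
print(matches)
```

With the DataFrames used in this notebook, the same idea would presumably be to build the dictionary from `zip(all_cik["NAME"], all_cik["CIK"])` and apply it with `company_names["Company"].map(...)`, which leaves unmatched companies as NaN.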

In [None]:
cwd = os.getcwd()
path = os.path.join(cwd, 'data','edgar-data')

# Find company names
company_names = pd.read_csv("data/company_list.csv")
company_names["CIK"] = ""
# Load the CIK lookup table
# It's a percent-dollar separated values file, sorry for that
all_cik = pd.read_csv("data/all_cik.pdsv", sep="^", lineterminator='\n')

found = False #To highlight if a company is found

ciklength = len(all_cik.index) # Number of CIK Entries

itr = 0
previtr = 5

for i in company_names.index:
    found = False
    itr = 0
    while itr < ciklength:
        try:
            if company_names["Company"][i] == all_cik["NAME"][itr]:
                print("found it! (", i+1 , ") " , all_cik["NAME"][itr], "at line", itr+2 )
                company_names.loc[i, "CIK"] = all_cik["CIK"][itr]  # .loc avoids chained assignment
                found = True
                previtr = itr
                break
            itr += 1
        except Exception:
            print("Error in company name: " , company_names["Company"][i] , " and " , all_cik["NAME"][itr] , " at cik file line " , itr+2)
            itr += 1  # skip the offending row; without this the loop would never advance
    if not found:
        print("Not found: " , company_names["Company"][i])
        # print("Starting Again From: " , itr+2, " ", all_cik["NAME"][previtr])

company_names.to_csv("data/output.csv", index=False)
        

## Importing SaaS Companies

### (If you don't have the output files already, go back to the previous step)

- Import data
- Fill NaN with zeroes
- Convert to int64
- Save to CSV (optional)

In [13]:
company_names = pd.read_csv("data/output.csv")
company_names["CIK"] = company_names["CIK"].fillna(0).astype(np.int64)
# company_names.to_csv("data/output.csv", index=False)
company_names.isna().sum()

Company    0
CIK        0
dtype: int64
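
The fill step comes before the conversion because companies without a match have an empty CIK cell in the CSV, which pandas reads back as NaN, and NaN cannot be cast to an integer. A small pure-Python sketch of the same idea follows; `cik_to_int` and the sample value are hypothetical.

```python
import math


def cik_to_int(value, missing=0):
    """Convert a CIK read as a float (possibly NaN) to int, using `missing` for NaN."""
    if isinstance(value, float) and math.isnan(value):
        return missing
    return int(value)


print(cik_to_int(1234567.0))     # hypothetical CIK read back as a float
print(cik_to_int(float("nan")))  # unmatched company

# Converting NaN directly fails, which is why the fill step is needed
try:
    int(float("nan"))
except ValueError as err:
    print("direct conversion failed:", err)
```

After this step, a CIK of 0 marks a company that was not found in the lookup table.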

## Downloading Filings

#### Using `python-edgar` to access the API
- This step is entirely optional


In [None]:
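path = os.path.join(os.getcwd(), 'data', 'edgar-data')  # redefined so this cell runs even if the optional cell was skipped
os.makedirs(path, exist_ok=True)  # make sure the download folder exists
edgar.download_index(path, 1996, "Indian Institute of Technology, Bhubaneswar 20CS01043@iitbbs.ac.in")
print("Downloaded Files saved at: ", path)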
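
Once the index is downloaded, the filings named in the objectives (10-K, 10-Q, 8-K) can be picked out per CIK. A rough sketch follows, assuming each index line is pipe-delimited with the fields CIK, company name, form type, filing date, and path to the text filing, in that order. This layout and the sample rows are assumptions for illustration, so check a downloaded file before relying on it. `filter_index` and `WANTED_FORMS` are hypothetical names.

```python
import csv
import io

WANTED_FORMS = {"10-K", "10-Q", "8-K"}


def filter_index(lines, ciks, forms=WANTED_FORMS):
    """Yield (cik, company, form, date_filed, txt_path) for rows matching `ciks` and `forms`."""
    # QUOTE_NONE: treat quote characters in company names as ordinary text
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 5 or not row[0].isdigit():
            continue  # skip blank, malformed, or header lines
        cik, company, form, date_filed, txt_path = row[:5]
        if int(cik) in ciks and form in forms:
            yield int(cik), company, form, date_filed, txt_path


# Hypothetical sample rows in the assumed layout (not real filings)
sample = io.StringIO(
    "1234567|EXAMPLE SOFTWARE INC|10-K|2021-02-15|edgar/data/1234567/a.txt|edgar/data/1234567/a-index.html\n"
    "1234567|EXAMPLE SOFTWARE INC|4|2021-03-01|edgar/data/1234567/b.txt|edgar/data/1234567/b-index.html\n"
    "7654321|OTHER CO|10-Q|2021-05-10|edgar/data/7654321/c.txt|edgar/data/7654321/c-index.html\n"
    "1234567|EXAMPLE SOFTWARE INC|8-K|2021-06-01|edgar/data/1234567/d.txt|edgar/data/1234567/d-index.html\n"
)

rows = list(filter_index(sample, ciks={1234567}))
for row in rows:
    print(row)
```

For a real file, the same function could be fed an open file handle instead of the `io.StringIO` sample, with `ciks` built from the non-zero values of `company_names["CIK"]`.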