# **EXPLORER FOR SEC FILINGS**
<hr>

## Inter IIT Tech Meet 10.0 (2022)

![image](https://www.sec.gov/edgar/search/images/edgar-logo-2x.png)
![image](https://interiit-tech.org/static/media/logo_1.f4d40e83.png)

In this notebook, we use the [EDGAR](https://www.sec.gov/edgar/searchedgar/) API to explore the SEC filings of a company. We shall be using the Python library python-edgar to access the API. Be careful: the SEC limits automated access to 10 requests per second. If a black SUV shows up out in the open, it's probably because you're doing something wrong.
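
To stay under the request limit mentioned above, one option is to space out calls on the client side. The following is a minimal sketch, not part of the original notebook: the `Throttle` class and its parameters are hypothetical names introduced here, and the default of 10 calls per second simply mirrors the limit quoted above.

```python
import time


class Throttle:
    """Space out calls so that at most `max_per_second` happen per second.

    Hypothetical helper for illustration; `clock` and `sleep` are injectable
    so the behaviour can be checked without actually waiting.
    """

    def __init__(self, max_per_second=10, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # time of the previous call, if any

    def wait(self):
        """Block until at least `min_interval` has passed since the last call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `throttle.wait()` immediately before each `requests.get(...)` keeps consecutive requests at least a tenth of a second apart. A lower `max_per_second` leaves some headroom.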

## Objectives
- Scrape Data from the company's History since inception
- Use 10-Q, 10-K, and 8-K filings to get the company's financial statements
- Use Financial Statements to get the company's balance sheet, income statement, cash flow statement, and ratios
- Use the data to get the company's current assets, liabilities, and equity
- Generate SaaS Metrics
- Generate a Financial Statement Analysis
- Use the metrics with Deep Learning Systems to give Insightful Results

Graciaz<br>
Kaushik Dey

In [50]:
import requests
import pandas as pd
import matplotlib.pyplot as plt
import pandasgui as pdgui
import os
import numpy as np
import json
import time
import pyjsonviewer as pjv
from bs4 import BeautifulSoup as bs

### Explore the Data

- Find the Data Structure
- Figure out the Form Links
- Find a way to store some of the data to train models

In [None]:
company_data = pd.read_csv('data/company_summary.csv')


### Analysing the JSON Structure

The structure of the file looks something like this:

```json
"filings": {
    "recent": {
        "accessionNumber": [], # List of accession numbers
        "filingDate": [],
        "reportDate": [],
        "acceptanceDateTime": [],
        "act": [],
        "form": [], # Look for 10-K 8-K 10-Q here
        "fileNumber": [],
        "filmNumber": [],
        "items": [],
        "size": [],
        "isXBRL": [],
        "isInlineXBRL": [],
        "primaryDocument": [], # The Document we need
        "primaryDocDescription": []
    },
    "files": []
}
```
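
The `recent` block is column-oriented: each key holds a list, and entry `i` of every list describes the same filing. A minimal sketch of turning that into one dictionary per filing is shown below. The function name `recent_to_records` and the sample values are made up for illustration, and the sketch assumes every list has the same length as `accessionNumber`.

```python
def recent_to_records(recent):
    """Convert the column-oriented "recent" block into one dict per filing.

    Assumes every list in `recent` is as long as `recent["accessionNumber"]`.
    """
    columns = list(recent.keys())
    count = len(recent["accessionNumber"])
    return [{col: recent[col][i] for col in columns} for i in range(count)]


# Made-up sample data with the same shape as the structure above
sample_recent = {
    "accessionNumber": ["0000000000-21-000001", "0000000000-21-000002"],
    "filingDate": ["2021-02-25", "2021-01-15"],
    "form": ["10-K", "8-K"],
    "primaryDocument": ["annual.htm", "current.htm"],
}

records = recent_to_records(sample_recent)
for record in records:
    print(record["form"], record["filingDate"], record["primaryDocument"])
```

Row-oriented records are often easier to filter and store than parallel lists, at the cost of repeating the keys for every filing.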

### Let's Now Explore the Data in a Single File

In [None]:
file = json.loads(company_data["HISTORY"][0])

head = {
    "User-Agent": "Alpha-Explorer/1.0",
    "Connection": "keep-alive"
}

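# Build the document URL: base / CIK / accession number without dashes / primary document
sample = "https://sec.gov/Archives/edgar/data/"
sample += file["cik"] + "/"
sample += file["filings"]["recent"]["accessionNumber"][0].replace("-", "")
sample += "/" + file["filings"]["recent"]["primaryDocument"][0]

# Fetch the document, sending the headers defined above
response = requests.get(sample, headers=head)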

print(response.text)
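
The URL construction in the cell above can be wrapped in a small helper so it is not repeated for every filing. This is a sketch under assumptions: the names `build_filing_url` and `filing_url_at` are hypothetical, and the CIK, accession number, and document name in the example are placeholders rather than a real filing.

```python
BASE_URL = "https://sec.gov/Archives/edgar/data/"


def build_filing_url(cik, accession_number, primary_document):
    """Join the base URL, CIK, dash-free accession number, and document name."""
    folder = accession_number.replace("-", "")
    return f"{BASE_URL}{cik}/{folder}/{primary_document}"


def filing_url_at(file, index):
    """Build the URL for the filing at position `index` of the "recent" block."""
    recent = file["filings"]["recent"]
    return build_filing_url(
        file["cik"],
        recent["accessionNumber"][index],
        recent["primaryDocument"][index],
    )


# Placeholder data shaped like the JSON stored in the HISTORY column
example_file = {
    "cik": "1234567",
    "filings": {
        "recent": {
            "accessionNumber": ["0001234567-21-000001"],
            "primaryDocument": ["example-10k.htm"],
        }
    },
}

example_url = filing_url_at(example_file, 0)
print(example_url)
```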

### Let's Now See How Iterating Through the Data Works

In [30]:
# Now we try iterating through the data
file = json.loads(company_data["HISTORY"][0])
length = len(file["filings"]["recent"]["accessionNumber"])
file_dict = file["filings"]["recent"]
print("10K Form Dates")
for i in range(length):
    if file_dict["form"][i] == "10-K":
        print( i,") ",file_dict["filingDate"][i])

10K Form Dates
76 )  2021-02-25
147 )  2020-02-28
220 )  2019-02-26
288 )  2018-02-27
367 )  2017-02-24
439 )  2016-03-10
566 )  2015-02-26
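
The loop above looks only for 10-K forms. Since the objectives also mention 10-Q and 8-K filings, the same scan can be generalized to any set of form types. The sketch below is illustrative: `find_filings` is a hypothetical name, and the sample data is made up.

```python
def find_filings(recent, form_types=("10-K",)):
    """Return (index, filing date) pairs for filings whose form is in `form_types`."""
    wanted = set(form_types)
    return [
        (i, recent["filingDate"][i])
        for i, form in enumerate(recent["form"])
        if form in wanted
    ]


# Made-up sample data with the same shape as file["filings"]["recent"]
sample_forms = {
    "form": ["8-K", "10-K", "10-Q", "10-K"],
    "filingDate": ["2021-03-01", "2021-02-25", "2020-11-05", "2020-02-28"],
}

annual = find_filings(sample_forms)
interim = find_filings(sample_forms, form_types=("10-Q", "8-K"))
print(annual)
print(interim)
```

Returning the index along with the date keeps it easy to look up the matching `accessionNumber` and `primaryDocument` entries afterwards.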


### Let's Now Try Scraping a Single Company for Some Real Data
- Get a 10-K form
- Get the form's data
- Try scraping it for relevant tables
- Pick a table to work with
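
The table-scraping step in the list above can be sketched as follows. The notebook imports BeautifulSoup for this kind of work. This example deliberately swaps in the standard library's `html.parser` so it runs without extra dependencies, and it is not the method used in the notebook. `TableExtractor` is a hypothetical name, the sample HTML is made up, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect each <table> as a list of rows, each row a list of cell texts.

    Simplified sketch: tables nested inside cells are not supported.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def _close_cell(self):
        # Store the cell text with runs of whitespace collapsed
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None and self.tables:
            self.tables[-1].append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td> or </th>
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


# Made-up HTML standing in for a downloaded filing
sample_html = """
<html><body>
<p>Not part of any table</p>
<table>
  <tr><th>Item</th><th>2020</th></tr>
  <tr><td>Revenue</td><td>  100 </td></tr>
</table>
</body></html>
"""

extractor = TableExtractor()
extractor.feed(sample_html)
tables = extractor.tables
print(tables)
```

In the notebook, `response.text` from a fetched 10-K would take the place of `sample_html`. Real filings often use layout tables and nested markup, so the extracted rows would likely need further cleaning.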

In [59]:
file = json.loads(company_data["HISTORY"][178]) #Oblong
print("Corp: ",file["name"])
length = len(file["filings"]["recent"]["accessionNumber"])
file_dict = file["filings"]["recent"]


head = {
    "User-Agent": "Alpha-Explorer/1.0",
    "Connection": "keep-alive"
}

for i in range(length):
    if file_dict["form"][i] == "10-K":

        #Define the URL
        print( i,") Date:",file_dict["filingDate"][i])
        sample = "https://sec.gov/Archives/edgar/data/" 
        sample += file["cik"] + "/"
        sample += file["filings"]["recent"]["accessionNumber"][i].replace("-", "")
        sample += "/" + file["filings"]["recent"]["primaryDocument"][i]
        print("Url: ",sample)
        


Corp:  Oblong, Inc.
32 ) Date: 2021-03-30
Url:  https://sec.gov/Archives/edgar/data/746210/000074621021000024/glow-20201231.htm
75 ) Date: 2020-05-15
Url:  https://sec.gov/Archives/edgar/data/746210/000074621020000022/glow-20191231x10k.htm
122 ) Date: 2019-03-08
Url:  https://sec.gov/Archives/edgar/data/746210/000074621019000009/glow-20181231x10k.htm
157 ) Date: 2018-03-07
Url:  https://sec.gov/Archives/edgar/data/746210/000074621018000012/glow-20171231x10k.htm
177 ) Date: 2017-03-31
Url:  https://sec.gov/Archives/edgar/data/746210/000074621017000004/glow-20161231x10k.htm
195 ) Date: 2016-03-17
Url:  https://sec.gov/Archives/edgar/data/746210/000074621016000121/glow-20151231x10k.htm
225 ) Date: 2015-03-05
Url:  https://sec.gov/Archives/edgar/data/746210/000074621015000055/glow-20141231x10k.htm
273 ) Date: 2014-03-06
Url:  https://sec.gov/Archives/edgar/data/746210/000074621014000016/glow-20131231x10k.htm
309 ) Date: 2013-04-01
Url:  https://sec.gov/Archives/edgar/data/746210/0000746210