# Welcome to the EDGAR API Instructions

### Table of Contents:
* What is EDGAR?
* How can I access the API?
* What can I use the API for?
* What can I do with this data?

## What is EDGAR?

EDGAR is the SEC database through which all companies, domestic and foreign, are required to file their registration statements, periodic reports, and other forms electronically. The SEC then makes these forms available through both the SEC EDGAR website and the SEC EDGAR API.

* SEC EDGAR - instructions for searching the EDGAR database
    *   https://www.sec.gov/edgar
* SEC EDGAR API - API documentation and instructions
    *   https://www.sec.gov/edgar/sec-api-documentation


## How can I access the API?

To navigate the EDGAR API easily, you will need to download a few additional resources provided through the EDGAR website.

* Ticker, CIK, and company name list
    * link: https://www.sec.gov/files/company_tickers.json
        * for the latest link, scroll to the bottom of this page: https://www.sec.gov/os/accessing-edgar-data
    * This is a complete list of all companies that have filed one or more forms with the SEC. The ticker identifies a company's stock, or a class of its stock, on the exchange where it trades. Exchanges generally require each ticker to be unique within their own system. A ticker usually consists of one or more capital letters, generally up to four or five, although some exchanges use numbers or a mix of numbers and letters instead. The CIK is the unique identifier that the SEC creates and assigns to manage its database. Because the SEC oversees filings across multiple exchanges, a ticker is not guaranteed to be unique among all filing companies, whereas the CIK is.
    * Having a complete ticker, CIK, and name list is important because it helps you identify the companies you are looking for, as well as their different share classes or types of shares on the same or different exchanges.
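
To make the ticker/CIK relationship concrete, here is a minimal sketch that uses a small hand-written sample in the same shape as `company_tickers.json` (an object keyed by row number, where each entry holds `cik_str`, `ticker`, and `title`). The `Example Corp` rows, their tickers, and CIK 999999 are hypothetical, made up only to illustrate one CIK with two share classes. It uses plain Python so it runs without any downloads.

```python
import json
from collections import defaultdict

# Hand-written sample in the shape of company_tickers.json.
# The "Example Corp" rows (tickers and CIK 999999) are hypothetical.
sample = """
{
  "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
  "1": {"cik_str": 732717, "ticker": "T", "title": "AT&T INC."},
  "2": {"cik_str": 999999, "ticker": "XMPLA", "title": "Example Corp"},
  "3": {"cik_str": 999999, "ticker": "XMPLB", "title": "Example Corp"}
}
"""

records = list(json.loads(sample).values())

# Group tickers by CIK: one filer (CIK) can have several tickers.
tickers_by_cik = defaultdict(list)
for rec in records:
    tickers_by_cik[rec["cik_str"]].append(rec["ticker"])

# Keep only the CIKs that map to more than one ticker.
multi_class = {cik: t for cik, t in tickers_by_cik.items() if len(t) > 1}
print(multi_class)
```

Grouping by CIK rather than by ticker is the design choice here: the CIK is the stable key, and tickers hang off it.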

With the complete ticker list downloaded (preferably into the same working directory as your working file, for ease of use), we can get started. First we will create a function that takes a CIK as it appears in the file and converts it to the `CIK##########.json` form (the letters `CIK` followed by the number zero-padded to 10 digits) needed to request data from EDGAR's RESTful API.

In [13]:
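import os

import pandas as pd

# Load the ticker list saved in the current working directory.
cwd = os.getcwd()
tickers = pd.read_json(os.path.join(cwd, 'company_tickers.json'), orient='index')

def get_cik(ticker, tickers):
    """Return the 'CIK##########.json' file name and the matching row(s) for a ticker."""
    match = tickers[tickers['ticker'] == str(ticker)]
    # Zero-pad the CIK to 10 digits, as the EDGAR API expects.
    cik = "CIK" + str(match['cik_str'].iloc[0]).zfill(10) + ".json"
    return cik, match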

print('This is the first output:')
print(get_cik('T',tickers)[0])
print('This is the second output:')
print(get_cik('T',tickers)[1])
print('This is the complete output:')
print(get_cik('T',tickers))

This is the first output:
CIK0000732717.json
This is the second output:
    cik_str ticker      title
98   732717      T  AT&T INC.
This is the complete output:
('CIK0000732717.json',     cik_str ticker      title
98   732717      T  AT&T INC.)
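
The lookup above raises an `IndexError` when a ticker is not in the list. As a hedged alternative, the sketch below does the same zero-padding on a plain list of records and raises a descriptive `KeyError` instead. The helper name `cik_filename` is hypothetical, and uppercasing the input is an assumption that every ticker in the file is stored in capitals. With the DataFrame from above, such a list could be obtained with `tickers.to_dict('records')`.

```python
def cik_filename(ticker, records):
    """Return 'CIK##########.json' for a ticker, or raise KeyError if unknown."""
    # Assumption: tickers in the list are upper-case.
    wanted = str(ticker).upper()
    for rec in records:
        if rec["ticker"] == wanted:
            # Zero-pad the numeric CIK to 10 digits.
            return "CIK{:010d}.json".format(int(rec["cik_str"]))
    raise KeyError(f"ticker not found: {ticker!r}")


# Two rows in the shape of the ticker list.
records = [
    {"cik_str": 732717, "ticker": "T", "title": "AT&T INC."},
    {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
]

print(cik_filename("t", records))
print(cik_filename("AAPL", records))
```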


Now we can start looking through the SEC database using the company's ticker.

In [110]:
import requests

cik = get_cik('AAPL', tickers)
url = "https://data.sec.gov/api/xbrl/companyfacts/"
# The SEC asks automated clients to identify themselves in the User-Agent header.
headers = {
    "User-Agent": "A X ax7@pt.me",
    "Accept-Encoding": "gzip, deflate, br",
    "Host": "data.sec.gov"}
url = url+str(cik[0])
print(url)
response = requests.get(url=url, headers=headers)
response = response.json()
print(response.keys())

# 'facts' groups the reported concepts by taxonomy.
tmp_facts = response['facts']
print(tmp_facts.keys())

tmp_dei = tmp_facts['dei']
#tmp_invest = tmp_facts['invest']
tmp_us_gaap = tmp_facts['us-gaap']
print(tmp_dei.keys())
#print(tmp_invest.keys())
print(tmp_us_gaap.keys())

https://data.sec.gov/api/xbrl/companyfacts/CIK0000320193.json
dict_keys(['cik', 'entityName', 'facts'])
dict_keys(['dei', 'us-gaap'])
dict_keys(['EntityCommonStockSharesOutstanding', 'EntityPublicFloat'])
dict_keys(['AccountsPayable', 'AccountsPayableCurrent', 'AccountsReceivableNetCurrent', 'AccruedIncomeTaxesCurrent', 'AccruedIncomeTaxesNoncurrent', 'AccruedLiabilities', 'AccruedLiabilitiesCurrent', 'AccruedMarketingCostsCurrent', 'AccumulatedDepreciationDepletionAndAmortizationPropertyPlantAndEquipment', 'AccumulatedOtherComprehensiveIncomeLossAvailableForSaleSecuritiesAdjustmentNetOfTax', 'AccumulatedOtherComprehensiveIncomeLossCumulativeChangesInNetGainLossFromCashFlowHedgesEffectNetOfTax', 'AccumulatedOtherComprehensiveIncomeLossForeignCurrencyTranslationAdjustmentNetOfTax', 'AccumulatedOtherComprehensiveIncomeLossNetOfTax', 'AdjustmentsToAdditionalPaidInCapitalSharebasedCompensationRequisiteServicePeriodRecognitionValue', 'AdjustmentsToAdditionalPaidInCapitalTaxEffectFromShareBas
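
Because the `us-gaap` key list is very long, it can help to summarize the response and search concept names by keyword. The sketch below is illustrative: `summarize_facts` and `find_concepts` are hypothetical helper names, and `mock_response` is a heavily trimmed stand-in with placeholder values that only mimics the nesting seen above (`facts`, then taxonomy, then concept).

```python
def summarize_facts(response):
    """Map each taxonomy name to the number of concepts it contains."""
    return {name: len(concepts) for name, concepts in response["facts"].items()}


def find_concepts(taxonomy, keyword):
    """Return the concept names containing keyword (case-insensitive), sorted."""
    kw = keyword.lower()
    return sorted(name for name in taxonomy if kw in name.lower())


# Placeholder body; a real concept carries a label, a description, and units.
placeholder = {"label": "placeholder", "description": "placeholder", "units": {}}

# Trimmed mock in the same shape as the companyfacts response.
mock_response = {
    "cik": 320193,
    "entityName": "Apple Inc.",
    "facts": {
        "dei": {
            "EntityCommonStockSharesOutstanding": placeholder,
            "EntityPublicFloat": placeholder,
        },
        "us-gaap": {
            "AccountsPayable": placeholder,
            "AccountsPayableCurrent": placeholder,
            "AccruedLiabilities": placeholder,
        },
    },
}

print(summarize_facts(mock_response))
print(find_concepts(mock_response["facts"]["us-gaap"], "payable"))
```

With the real data, the same calls would be `summarize_facts(response)` and `find_concepts(tmp_us_gaap, 'payable')`.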

In [112]:
tmp_us_gaap_df = pd.DataFrame(columns=['label', 'description', 'units'])

for concept in tmp_us_gaap.keys():
    temp_df = pd.json_normalize(tmp_us_gaap[concept])
    if len(temp_df.columns) == 3:
        # Single-unit concept: collect its label, description, and list of reported values.
        temp_df.columns = ['label', 'description', 'units']
        tmp_us_gaap_df = pd.concat([tmp_us_gaap_df, temp_df], axis=0)
    else:
        # Concept reported in more than one unit (e.g. 'pure' and 'Year'): print it instead.
        print(temp_df.loc[0])

print(tmp_us_gaap_df)

label          Finite-Lived Intangible Assets, Useful Life, M...
description    The maximum useful life of a major finite-live...
units.pure     [{'start': '2009-09-27', 'end': '2010-09-25', ...
units.Year     [{'start': '2010-09-26', 'end': '2011-09-24', ...
Name: 0, dtype: object
label          Finite-Lived Intangible Assets, Useful Life, M...
description    The minimum useful life of a major finite-live...
units.pure     [{'start': '2009-09-27', 'end': '2010-09-25', ...
units.Year     [{'start': '2010-09-26', 'end': '2011-09-24', ...
Name: 0, dtype: object
label          Share-based Compensation Arrangement by Share-...
description    The weighted average period between the balanc...
units.pure     [{'start': '2009-09-27', 'end': '2010-06-26', ...
units.Year     [{'start': '2010-09-26', 'end': '2011-03-26', ...
Name: 0, dtype: object
                                                label  \
0            Accounts Payable (Deprecated 2009-01-31)   
0                           Accounts 
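
The loop above sets aside concepts that are reported in more than one unit. One possible alternative, sketched here in plain Python, is to flatten every concept and every unit into one row per reported value, so that multi-unit concepts are kept too. The function name `iter_fact_rows`, the concept names, and all `val` numbers in `mock_taxonomy` are hypothetical. Only the nesting (concept, then `units`, then a list of records) follows the output shown above.

```python
def iter_fact_rows(taxonomy):
    """Yield one flat dict per reported value, for every concept and unit."""
    for concept, body in taxonomy.items():
        for unit, records in body.get("units", {}).items():
            for rec in records:
                # Prefix each record with the concept, its label, and the unit.
                yield {"concept": concept, "label": body.get("label"),
                       "unit": unit, **rec}


# Hypothetical taxonomy: one concept with two units, one with a single unit.
mock_taxonomy = {
    "ExampleUsefulLife": {
        "label": "Example useful life",
        "description": "placeholder",
        "units": {
            "pure": [{"start": "2009-09-27", "end": "2010-09-25", "val": 3}],
            "Year": [{"start": "2010-09-26", "end": "2011-09-24", "val": 5}],
        },
    },
    "ExampleExpense": {
        "label": "Example expense",
        "description": "placeholder",
        "units": {
            "USD": [{"start": "2009-09-27", "end": "2010-09-25", "val": 100}],
        },
    },
}

rows = list(iter_fact_rows(mock_taxonomy))
for row in rows:
    print(row["concept"], row["unit"], row["val"])
```

Passing `rows` to `pd.DataFrame(rows)` would give a single long table with a `unit` column instead of one column per unit.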

In [121]:
label_list = list(tmp_us_gaap_df['label'])
label = label_list[15]

# Select the row for this label; its 'units' cell holds the list of reported values.
row = tmp_us_gaap_df[tmp_us_gaap_df['label']==label].iloc[0]
data_df = pd.json_normalize(row['units'])
print(label)
print(row['label'])
print(row['description'])
print(data_df)

Advertising Expense
Advertising Expense
Amount charged to advertising expense for the period, which are expenses incurred with the objective of increasing revenue for a specified brand, product or product line.
         start         end         val                  accn    fy  fp  form  \
0   2007-09-30  2008-09-27   486000000  0001193125-10-238044  2010  FY  10-K   
1   2008-09-28  2009-09-26   501000000  0001193125-11-282113  2011  FY  10-K   
2   2008-09-28  2009-09-26   501000000  0001193125-10-238044  2010  FY  10-K   
3   2009-09-27  2010-09-25   691000000  0001193125-13-170623  2012  FY   8-K   
4   2009-09-27  2010-09-25   691000000  0001193125-12-444068  2012  FY  10-K   
5   2009-09-27  2010-09-25   691000000  0001193125-11-282113  2011  FY  10-K   
6   2009-09-27  2010-09-25   691000000  0001193125-10-238044  2010  FY  10-K   
7   2010-09-26  2011-09-24   933000000  0001193125-13-416534  2013  FY  10-K   
8   2010-09-26  2011-09-24   933000000  0001193125-13-170623  2012  F
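
The output shows the same period repeated once for each filing that reported it (for example, the 2009-09-27 to 2010-09-25 value appears in four filings). A minimal sketch of one way to collapse these into one value per period is given below, using the first eight rows from the output above. Keeping the last listed record for each period is an arbitrary choice made for illustration, not a rule from the SEC documentation.

```python
# The first eight rows of the Advertising Expense output above.
columns = ["start", "end", "val", "accn", "fy", "fp", "form"]
raw_rows = [
    ("2007-09-30", "2008-09-27", 486000000, "0001193125-10-238044", 2010, "FY", "10-K"),
    ("2008-09-28", "2009-09-26", 501000000, "0001193125-11-282113", 2011, "FY", "10-K"),
    ("2008-09-28", "2009-09-26", 501000000, "0001193125-10-238044", 2010, "FY", "10-K"),
    ("2009-09-27", "2010-09-25", 691000000, "0001193125-13-170623", 2012, "FY", "8-K"),
    ("2009-09-27", "2010-09-25", 691000000, "0001193125-12-444068", 2012, "FY", "10-K"),
    ("2009-09-27", "2010-09-25", 691000000, "0001193125-11-282113", 2011, "FY", "10-K"),
    ("2009-09-27", "2010-09-25", 691000000, "0001193125-10-238044", 2010, "FY", "10-K"),
    ("2010-09-26", "2011-09-24", 933000000, "0001193125-13-416534", 2013, "FY", "10-K"),
]
records = [dict(zip(columns, row)) for row in raw_rows]

# Keep one record per (start, end) period; later rows overwrite earlier ones.
by_period = {}
for rec in records:
    by_period[(rec["start"], rec["end"])] = rec

for (start, end), rec in by_period.items():
    print(start, end, rec["val"])
```

With the DataFrame from above, a comparable result could be obtained with `data_df.drop_duplicates(subset=['start', 'end'], keep='last')`.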