# 1.1 Accessing the API and Obtaining One Earnings Call Transcript

All earnings call transcripts were obtained from the Financial Modeling Prep API ([Financial Modeling Prep Link](https://site.financialmodelingprep.com/)).

Earnings call transcripts were accessed using a **URL**.
Sample URL: **"https://financialmodelingprep.com/api/v3/earning_call_transcript/ZYME?quarter=1&year=2022&apikey=5c6d8cb2a7a5d65fa7b915e87c83469e"**

Parameters of the URL:
*   ZYME is the ticker symbol of a listed company (stock). Replace it with the symbol of your company of interest
*   quarter = 1, 2, 3, or 4 (quarter of the earnings call transcript)
*   year = #### (four-digit year of the earnings call transcript)
*   apikey = your API key
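
As a minimal sketch, the parameters above can be assembled into a request URL programmatically. The helper name `build_transcript_url` and the placeholder key `YOUR_API_KEY` are assumptions for illustration and do not come from the Financial Modeling Prep documentation; the URL pattern simply mirrors the sample URL.

```python
from urllib.parse import quote, urlencode

# Base path copied from the sample URL above.
BASE_URL = "https://financialmodelingprep.com/api/v3/earning_call_transcript/"


def build_transcript_url(symbol, quarter, year, apikey):
    """Hypothetical helper: assemble the request URL from its four parameters."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    # urlencode keeps the insertion order of the dictionary keys.
    query = urlencode({"quarter": quarter, "year": year, "apikey": apikey})
    return BASE_URL + quote(symbol) + "?" + query


# "YOUR_API_KEY" is a placeholder, not a working key.
example_url = build_transcript_url("ZYME", 1, 2022, "YOUR_API_KEY")
print(example_url)
```

Building the query string with `urlencode` avoids manual string concatenation and escapes any characters that are not valid in a URL.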






**Note**: Only users with a Financial Modeling Prep subscription can access the earnings call transcripts.


In [None]:
# 1. Downloading Earning Call Transcript Data Version 1.0
# author: Yuchen Zhou
# the script downloads earnings call transcript data from an API (Financial Modeling Prep)
# and saves the data into csv files

try:
    # For Python 3.0 and later
    from urllib.request import urlopen
except ImportError:
    # Fall back to Python 2's urllib2
    from urllib2 import urlopen

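import certifi
import json
import pandas as pd
import ssl
import time

def get_jsonparsed_data(url):
    """
    Receive the content of ``url``, parse it as JSON and return the object.

    Parameters
    ----------
    url : str

    Returns
    -------
    dict or list
        The parsed JSON payload.
    """
    # urlopen's ``cafile`` argument was deprecated in Python 3.6 and removed in 3.13,
    # so the certifi CA bundle is passed through an SSL context instead
    context = ssl.create_default_context(cafile=certifi.where())
    response = urlopen(url, context=context)
    data = response.read().decode("utf-8")
    return json.loads(data)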

# use the following code to test the API

url = ("https://financialmodelingprep.com/api/v3/earning_call_transcript/ZYME?quarter=1&year=2022&apikey=5c6d8cb2a7a5d65fa7b915e87c83469e")
print(get_jsonparsed_data(url)) # should output a dictionary

print(get_jsonparsed_data("https://financialmodelingprep.com/api/v4/earning_call_transcript?symbol=BCOW&apikey=5c6d8cb2a7a5d65fa7b915e87c83469e"))  # should output an empty dictionary

Now we can use the `get_jsonparsed_data(url)` function to obtain the earnings call transcript of a company for a specific year and quarter.
The function returns the parsed JSON response, in which each transcript is a **dictionary**.

The keys of the dictionary are:


*  symbol: the symbol of the listed company (stock)
*  quarter: the quarter of the earning call transcript
*  year: the year of the earning call transcript
*  date: the date of the earning call transcript
*  content: the content of the earning call transcript
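
A minimal sketch of reading these keys from one transcript record. The record below is a hand-written placeholder that only reuses the key names listed above; its values are not real API output, and `describe_transcript` is a hypothetical helper name.

```python
def describe_transcript(record):
    """Hypothetical helper: summarize one transcript record in a single line."""
    word_count = len(record["content"].split())
    return "{} Q{} {} ({}): {} words".format(
        record["symbol"], record["quarter"], record["year"], record["date"], word_count
    )


# Placeholder record for illustration only; the values are not real API output.
sample_record = {
    "symbol": "ZYME",
    "quarter": 1,
    "year": 2022,
    "date": "2022-01-01 00:00:00",
    "content": "Operator: Good morning and welcome to the call.",
}

print(describe_transcript(sample_record))
```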







# 1.2 Obtaining Earnings Call Transcripts from Financial Service Companies for the Most Recent Five Years

Symbols of financial service companies listed on NASDAQ and NYSE were obtained from [Yahoo Finance](https://ca.finance.yahoo.com/) and stored in a CSV file (Financial service.csv).

In [None]:
financial_service = pd.read_csv("Financial service.csv")

In [None]:
# showing the contents of the csv file
financial_service

| Unnamed: 0 | Symbol | Current Price | Date | Time | Change | Open | High | Low | Volume | Trade Date | Purchase Price | Quantity | Commission | High Limit | Low Limit | Comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | VBOCW | 0.0531 | 2023/04/12 | 11:44 EDT | -0.009700 | 0.051981 | 0.0531 | 0.0374 | 3404 | | | | | | | |
| 1 | PRSRW | 0.0768 | 2023/04/12 | 15:59 EDT | 0.000200 | 0.750000 | 0.0804 | 0.0732 | 6454 | | | | | | | |
| 2 | QDROU | 10.2000 | 2023/04/11 | 15:51 EDT | -0.050000 | 10.240000 | 10.2500 | 10.2000 | 400 | | | | | | | |
| 3 | GATEW | 0.1000 | 2023/04/12 | 09:52 EDT | -0.008000 | 0.104100 | 0.1041 | 0.1000 | 200 | | | | | | | |
| 4 | PRLHW | 0.1100 | 2023/04/12 | 11:16 EDT | -0.000100 | 0.110000 | 0.1100 | 0.1100 | 10426 | | | | | | | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2225 | MA | 361.7800 | 2023/04/12 | 16:00 EDT | -2.350006 | 366.080000 | 368.6000 | 361.0250 | 2170517 | | | | | | | |
| 2226 | JPM | 128.5000 | 2023/04/12 | 16:00 EDT | -0.020004 | 129.180000 | 130.4300 | 128.0603 | 11586455 | | | | | | | |
| 2227 | V | 227.8100 | 2023/04/12 | 16:00 EDT | -0.639999 | 229.930000 | 231.5900 | 227.3300 | 4252063 | | | | | | | |
| 2228 | BRK-B | 314.5500 | 2023/04/12 | 16:03 EDT | 0.849976 | 315.970000 | 316.9200 | 313.7200 | 2639645 | | | | | | | |
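
Only the `Symbol` column of this table is used when requesting transcripts. A minimal sketch of extracting a clean list of symbols from it, using a small in-memory stand-in for `financial_service`; the helper name `unique_symbols` and the stand-in rows are assumptions for illustration.

```python
import pandas as pd


def unique_symbols(frame):
    """Hypothetical helper: return the distinct, non-missing symbols in file order."""
    symbols = frame["Symbol"].dropna().astype(str).str.strip()
    # Drop empty strings, then keep the first occurrence of each symbol.
    return symbols[symbols != ""].drop_duplicates().tolist()


# Small stand-in for the CSV contents, including a duplicate and a missing value.
stand_in = pd.DataFrame({"Symbol": ["MA", "JPM", "MA", None, "BRK-B"]})
print(unique_symbols(stand_in))
```

Removing duplicates before the download step avoids requesting the same transcript more than once.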


In [None]:
# obtaining earnings call transcripts

years = [2022, 2021, 2020, 2019, 2018]
quarters = [1, 2, 3, 4]

# the first loop iterates through the five most recent years
for year in years:

    # the second loop goes through each quarter
    for quarter in quarters:

        # this list stores the dataframes that contain the earnings call transcripts of financial service companies
        financial_df_list = []

        # the third loop goes through each company in the csv file
        for index, row in financial_service.iterrows():

            # set the URL for accessing the earnings call transcript from the API
            current_url = ("https://financialmodelingprep.com/api/v3/earning_call_transcript/" + row["Symbol"] + "?quarter=" + str(quarter) + "&year=" + str(year) + "&apikey=5c6d8cb2a7a5d65fa7b915e87c83469e")

            # request the transcript once and reuse the response instead of calling the API twice
            current_data = get_jsonparsed_data(current_url)

            # if the earnings call transcript exists in the API, convert it into a dataframe and put it into financial_df_list
            if len(current_data) != 0:
                current_df = pd.DataFrame.from_dict(current_data)
                financial_df_list.append(current_df)
                print(len(financial_df_list), "samples are being added to the dataframe")

        # skip quarters without any transcript, since pd.concat raises a ValueError on an empty list
        if not financial_df_list:
            continue

        # combine the dataframes in financial_df_list into a single Pandas dataframe
        final_df = pd.concat(financial_df_list)

        # store it in a csv file for future use
        final_df.to_csv("financial_service_" + str(year) + "_Q" + str(quarter) + ".csv", index=False)
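
The loop above issues one request per company for every quarter and year, and an exception raised by any single request stops the whole run. A minimal sketch of a retry wrapper, assuming that transient failures surface as exceptions; `fetch_with_retry`, its parameters, and the stand-in fetcher `flaky_fetch` are hypothetical and not part of the original script.

```python
import time


def fetch_with_retry(fetch, url, retries=3, delay=1.0):
    """Hypothetical helper: call fetch(url), pausing and retrying on failure."""
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except Exception:
            # Re-raise once the final attempt has failed.
            if attempt == retries:
                raise
            time.sleep(delay)


# Stand-in fetcher that fails twice before succeeding, to exercise the retry path.
calls = {"count": 0}


def flaky_fetch(url):
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("temporary failure")
    return [{"symbol": "ZYME"}]


result = fetch_with_retry(flaky_fetch, "https://example.invalid/", delay=0.0)
print(result, calls["count"])
```

In the download script, `get_jsonparsed_data` would take the place of `flaky_fetch`, for example `fetch_with_retry(get_jsonparsed_data, current_url)`.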
