# Edgar Filings - Shares Outstanding

This notebook shows how to use Algoseek's Edgar Filings dataset to retrieve a company's shares outstanding. Shares outstanding is the total number of a company's shares currently held by all of its shareholders, including insiders and institutional investors. It is the basis for computing market capitalization and for understanding ownership structure. The values are extracted from the company's 10-Q and 10-K filings, and the dataset provides the shares outstanding for each filing date.

NOTE: This dataset provides data as-is from the SEC filings. For shares outstanding, this means the values are not adjusted for corporate actions such as stock splits or reverse splits.
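To make the link to market capitalization concrete, here is a minimal sketch of the arithmetic. The helper `market_cap` is a hypothetical name introduced for illustration, and the price is a made-up placeholder rather than market data. The share count is one of the as-reported AMZN values shown later in this notebook.

```python
def market_cap(shares_outstanding: int, price_per_share: float) -> float:
    """Market capitalization = shares outstanding x price per share."""
    return shares_outstanding * price_per_share


# 454,000,000 is an as-reported AMZN share count from this notebook's output.
# The 300.00 price is a placeholder assumption, not a real quote.
example_cap = market_cap(454_000_000, 300.00)
print(f"{example_cap:,.0f}")  # 136,200,000,000
```

The share count and the price must be on the same split basis. Because the dataset is as-reported, pairing a pre-split share count with a post-split price (or the reverse) would give a misleading result.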



In [2]:
import os
import pandas as pd
from IPython.display import HTML, display
import clickhouse_connect
from dotenv import load_dotenv

load_dotenv()

client = clickhouse_connect.get_client(
    host=os.environ["CLICKHOUSE_HOST"],
    port=int(os.environ["CLICKHOUSE_PORT"]),
    user=os.environ["CLICKHOUSE_USER"],
    password=os.environ["CLICKHOUSE_PASSWORD"],
    database=os.environ["CLICKHOUSE_DATABASE"],
)

## Display a time series of a single concept

In this example, we display the time series of the concept `CommonStockSharesOutstanding`, available in the `10-K` and `10-Q` reports, for the ticker `AMZN`.

### The query

The query itself is defined in the code cell that follows the breakdown below.

### Query breakdown

The SQL query selects data from the `financials` table. Here is a breakdown of what it does:

- `SELECT formatDateTime(t.report_period_end_date, '%Y-%m-%d') as period_date, t.period_fiscal_year as fiscal_year, t.period_fiscal_period as fiscal_period, t.fact_value as value`: This part of the query selects four columns. It formats `report_period_end_date` as `YYYY-MM-DD` and aliases it to `period_date`. It also aliases `period_fiscal_year` to `fiscal_year`, `period_fiscal_period` to `fiscal_period`, and `fact_value` to `value`.

- The `FROM` clause reads from a subquery. The subquery selects all columns from the `financials` table where `entity_ticker` matches the `tickers` parameter and `relationship_target_name` is `'CommonStockSharesOutstanding'`. It also computes a row number (`rn`) for each row, partitioned by `period_fiscal_year` and `period_fiscal_period` and ordered by `report_period_end_date` in descending order. Within each combination of `period_fiscal_year` and `period_fiscal_period`, the row with the most recent `report_period_end_date` therefore gets `rn = 1`. Note that the partition does not include `entity_ticker`, so the query as written assumes a single ticker.

- The `WHERE t.rn = 1` clause keeps only the rows where `rn` is 1. Because of how `rn` is calculated, this is the row with the most recent `report_period_end_date` for each combination of `period_fiscal_year` and `period_fiscal_period`.

- Finally, the `ORDER BY t.report_period_end_date` clause sorts the final result by `report_period_end_date` in ascending order.
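The deduplication logic described above (`ROW_NUMBER()` per partition, then `rn = 1`) can be sketched in plain Python to make each step explicit. This is only an illustration on hypothetical rows. The function name `latest_per_period` and the second `1Q` row are assumptions, not part of the dataset or of the query.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows: (fiscal_year, fiscal_period, report_period_end_date, value).
rows = [
    (2011, "1Q", "2011-03-31", 452_000_000),
    (2011, "1Q", "2010-03-31", 446_000_000),  # made-up older fact, same partition
    (2011, "2Q", "2011-06-30", 454_000_000),
]


def latest_per_period(rows):
    """Keep the row with the latest end date per (fiscal_year, fiscal_period)."""
    partition_key = itemgetter(0, 1)
    kept = []
    # Sort by the partition key so groupby sees each partition as one run.
    for _, group in groupby(sorted(rows, key=partition_key), key=partition_key):
        # Same effect as ORDER BY report_period_end_date DESC followed by rn = 1.
        # ISO-formatted date strings compare correctly as plain strings.
        kept.append(max(group, key=itemgetter(2)))
    # Mirrors the final ORDER BY report_period_end_date.
    return sorted(kept, key=itemgetter(2))


result = latest_per_period(rows)
for row in result:
    print(row)
```

Only one row per fiscal year and fiscal period survives, which is the role `WHERE t.rn = 1` plays in the SQL.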

In [4]:
params = {"tickers": "AMZN"}

query_ts = """
        SELECT formatDateTime(t.report_period_end_date, '%Y-%m-%d') as period_date,
            t.period_fiscal_year as fiscal_year,
            t.period_fiscal_period as fiscal_period,
            t.fact_value as value
        FROM (
                SELECT *,
                    ROW_NUMBER() OVER(
                        PARTITION BY period_fiscal_year,
                        period_fiscal_period
                        ORDER BY report_period_end_date DESC
                    ) as rn
                FROM financials f
                WHERE f.entity_ticker IN ({tickers: String})
                    AND f.relationship_target_name = 'CommonStockSharesOutstanding'
            ) t
        WHERE t.rn = 1
        ORDER BY t.report_period_end_date
"""

result_ts = client.query(query_ts, parameters=params)

# Convert the result to a pandas DataFrame
df = pd.DataFrame(result_ts.result_rows, columns=result_ts.column_names)

# Display the DataFrame
display(HTML(df.to_html(notebook=True)))

| | period_date | fiscal_year | fiscal_period | value |
|---|---|---|---|---|
| 0 | 2009-06-30 | 2009 | 2Q | 432000000 |
| 1 | 2009-09-30 | 2009 | 3Q | 433000000 |
| 2 | 2010-09-30 | 2009 | Y | 444000000 |
| 3 | 2011-03-31 | 2011 | 1Q | 452000000 |
| 4 | 2011-06-30 | 2011 | 2Q | 454000000 |
| 5 | 2011-09-30 | 2010 | Y | 451000000 |
| 6 | 2012-09-30 | 2011 | Y | 455000000 |
| 7 | 2013-12-31 | 2012 | Y | 454000000 |
| 8 | 2014-09-30 | 2013 | Y | 459000000 |
| 9 | 2014-12-31 | 2014 | Y | 465000000 |


The as-reported values reflect the 20-for-1 stock split of 2022: filings made after the split report post-split share counts, while earlier filings report pre-split counts. Because this is as-reported data, it does not contain any adjustment factors. We are working on adding adjustment factors to the Edgar Filings dataset, but for now they are available as a separate dataset.
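As a rough sketch of how such adjustment factors might be applied, the snippet below scales pre-split share counts onto a post-split basis with pandas. Everything specific here is an assumption for illustration: the `adjust_for_splits` helper, the shape of the `splits` list, the split effective date, and the 2022 row in the sample frame are not taken from the dataset.

```python
import pandas as pd

# Hypothetical split table: (assumed effective date, split ratio).
# In practice this would come from the separate adjustment-factor dataset.
splits = [(pd.Timestamp("2022-06-06"), 20.0)]


def adjust_for_splits(df, splits):
    """Scale share counts dated before each split onto a post-split basis."""
    out = df.copy()
    dates = pd.to_datetime(out["period_date"])
    out["adjusted_value"] = out["value"].astype("float64")
    for split_date, ratio in splits:
        # Only rows dated before the split need to be scaled by the ratio.
        out.loc[dates < split_date, "adjusted_value"] *= ratio
    return out


# The first row matches the as-reported output above; the second row is a
# made-up post-split placeholder used only to show that it is left untouched.
sample = pd.DataFrame({
    "period_date": ["2014-12-31", "2022-12-31"],
    "value": [465_000_000, 10_000_000_000],
})
adjusted = adjust_for_splits(sample, splits)
print(adjusted)
```

This sketch keys the adjustment on `period_date`, which is a simplification. A filing's cover-page share count can be dated differently from the period end, so a production version would need to compare against the date the count actually refers to.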