This notebook builds a map between the CRSP PERMNO identifier (accessed through WRDS) and the SEC CIK identifier, using the ticker symbol as the link.

In [1]:
import pandas as pd


Load my saved ticker-CIK map:

In [2]:
# Load from your processed folder
df_ticker_cik = pd.read_csv("../data/processed/sec_ticker_cik_mapping.csv")

print(df_ticker_cik.shape)
df_ticker_cik.head()


(12084, 2)


Unnamed: 0,ticker,cik
0,aapl,320193
1,msft,789019
2,brk-b,1067983
3,unh,731766
4,jnj,200406


Cleaning:

In [3]:
# Make sure 'ticker' is lowercase (SEC gives lowercase but double check)
df_ticker_cik['ticker'] = df_ticker_cik['ticker'].str.lower()

df_ticker_cik = df_ticker_cik.rename(columns={'ticker': 'tic'})


# Make sure 'cik' is a 10-digit string
df_ticker_cik['cik'] = df_ticker_cik['cik'].astype(str).str.zfill(10)

df_ticker_cik


Unnamed: 0,tic,cik
0,aapl,0000320193
1,msft,0000789019
2,brk-b,0001067983
3,unh,0000731766
4,jnj,0000200406
...,...,...
12079,hcicu,0001829455
12080,hcicw,0001829455
12081,hawlm,0000046207
12082,hbanm,0000049196
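
The zero-padding step above assumes `cik` was read as an integer. A minimal sketch of a more defensive version is below; `normalize_cik` and the `demo` values are hypothetical names for illustration (the three CIKs are taken from the table above), and it assumes every value is a whole number.

```python
import pandas as pd


def normalize_cik(series: pd.Series) -> pd.Series:
    """Return CIKs as 10-character, zero-padded strings (hypothetical helper)."""
    # Go through a nullable integer type so a value like 1067983.0
    # does not end up as the string "1067983.0" before padding.
    as_int = pd.to_numeric(series, errors="raise").astype("Int64")
    return as_int.astype("string").str.zfill(10)


# Mixed int / str / float input, as could happen after a round trip through CSV
demo = pd.Series([320193, "789019", 1067983.0])
print(normalize_cik(demo).tolist())
```

With integer-only input this should give the same result as `astype(str).str.zfill(10)`.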


Get the permno-tic map:

In [4]:
# Load df_class first
df_class = pd.read_stata("../data/supplement/ccm_from_class/CCM_cleaned_for_class.dta")

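# Select only the lpermno and tic columns; .copy() gives an independent
# DataFrame, which avoids SettingWithCopyWarning on later assignments
df_class = df_class[['lpermno', 'tic']].copy()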

# Clean ticker to lowercase
df_class['tic'] = df_class['tic'].str.lower()

print(df_class.shape)
df_class


(223001, 2)


Unnamed: 0,lpermno,tic
0,25881.0,ae.2
1,25881.0,ae.2
2,25881.0,ae.2
3,10015.0,amfd.
4,10015.0,amfd.
...,...,...
222996,13104.0,pacd
222997,13104.0,pacd
222998,13861.0,tam
222999,14344.0,salt


Cleaning:

In [5]:
# Drop duplicate permno-tic pairs; .copy() so the assignment below
# works on an independent DataFrame rather than a slice
df_class = df_class.drop_duplicates(subset=['lpermno', 'tic']).copy()

# Clean up ticker names: keep only the part before the first '.' (e.g. 'ae.2' -> 'ae')
df_class['tic'] = df_class['tic'].str.split('.').str[0]

print(df_class.shape)
df_class


(23111, 2)


Unnamed: 0,lpermno,tic
0,25881.0,ae
3,10015.0,amfd
6,10031.0,antq
12,54594.0,air
51,61903.0,aba
...,...,...
222993,13707.0,rdhl
222995,13104.0,pacd
222998,13861.0,tam
222999,14344.0,salt
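
One thing to watch in the cell above: duplicates are dropped before the suffix is stripped, so two rows that differ only by suffix can collapse to the same permno-tic pair afterwards. A small sketch of the effect, using made-up toy rows (the plain `ae` row for 25881 is hypothetical, not taken from the data):

```python
import pandas as pd

# Toy data: 'ae.2' and 'ae' are different strings until the suffix is stripped
toy = pd.DataFrame({
    "lpermno": [25881.0, 25881.0, 10015.0],
    "tic": ["ae.2", "ae", "amfd."],
})

# Order used above: drop duplicates first, then strip the suffix
dedup_first = toy.drop_duplicates(subset=["lpermno", "tic"]).copy()
dedup_first["tic"] = dedup_first["tic"].str.split(".").str[0]

# Alternative order: strip the suffix first, then drop duplicates
strip_first = toy.copy()
strip_first["tic"] = strip_first["tic"].str.split(".").str[0]
strip_first = strip_first.drop_duplicates(subset=["lpermno", "tic"])

print(len(dedup_first), len(strip_first))
```

If the real data has such pairs, running `drop_duplicates` once more after the strip would remove them; I have not checked how many there are.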


In [6]:
# 1. Convert lpermno from float to int
#    (DataFrame.astype returns a new DataFrame, so nothing is set on a slice)
df_class = df_class.astype({'lpermno': int})

# 2. Rename lpermno to permno
df_class = df_class.rename(columns={'lpermno': 'permno'})

print(df_class.dtypes)
df_class


permno     int64
tic       object
dtype: object


Unnamed: 0,permno,tic
0,25881,ae
3,10015,amfd
6,10031,antq
12,54594,air
51,61903,aba
...,...,...
222993,13707,rdhl
222995,13104,pacd
222998,13861,tam
222999,14344,salt
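
Since the next step joins the two tables on `tic`, it may help to see which tickers fail to match. A rough sketch using an outer merge with `indicator=True` on toy frames; the names `sec` and `ccm`, the `brk` row, and its permno 99999 are hypothetical placeholders, chosen to mimic a share-class ticker like `brk-b` losing its suffix on one side only.

```python
import pandas as pd

# Toy stand-ins for the two tables (not the real data)
sec = pd.DataFrame({"tic": ["aapl", "brk-b"],
                    "cik": ["0000320193", "0001067983"]})
ccm = pd.DataFrame({"permno": [14593, 99999],  # 99999 is a placeholder
                    "tic": ["aapl", "brk"]})

# indicator=True adds a '_merge' column: 'both', 'left_only' or 'right_only'
check = sec.merge(ccm, on="tic", how="outer", indicator=True)

unmatched_sec = check.loc[check["_merge"] == "left_only", "tic"].tolist()
unmatched_ccm = check.loc[check["_merge"] == "right_only", "tic"].tolist()
print(unmatched_sec, unmatched_ccm)
```

Running the same check on `df_ticker_cik` and `df_class` would list the tickers that an inner join silently drops.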


### Creating permno-tic-cik map

In [7]:
df_cik_permno = pd.merge(
    df_ticker_cik,
    df_class,
    on="tic",
    how="inner"
)

print(df_cik_permno.shape)
df_cik_permno


(4921, 3)


Unnamed: 0,tic,cik,permno
0,aapl,0000320193,14593
1,msft,0000789019,10107
2,unh,0000731766,92655
3,jnj,0000200406,22111
4,v,0001403161,46842
...,...,...,...
4916,gldi,0001053092,84614
4917,gll,0001415311,67934
4918,gru,0000352960,43465
4919,gru,0000352960,65613
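
The last two rows show `gru` matched to two different permnos, so the ticker join is not one-to-one. A minimal sketch for flagging such cases, using three rows copied from the output above (`toy_map` is just an illustrative name):

```python
import pandas as pd

# Three rows taken from the merged output above
toy_map = pd.DataFrame({
    "tic": ["aapl", "gru", "gru"],
    "cik": ["0000320193", "0000352960", "0000352960"],
    "permno": [14593, 43465, 65613],
})

# Count distinct permnos per CIK and keep the CIKs with more than one
permnos_per_cik = toy_map.groupby("cik")["permno"].nunique()
ambiguous = permnos_per_cik[permnos_per_cik > 1]
print(ambiguous)
```

Applied to `df_cik_permno`, this would give the list of CIKs that need a manual look or an extra key (for example a date range) before the map is used.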
