# Using [FindACode](https://www.findacode.com/tools/map-a-code/cpt-hcpcs-ccs.php)


There are different ways of mapping procedure codes to CCS:
 - one is to use the coded min/max code ranges and allocate a token to each code that falls within a range, as done in the Python script
 - the other is to use the FindACode mapping tool, entering the codes manually and obtaining the tokens
     - for this you need to create an account and enter every code that can occur in the dataset
     - then download the corresponding output to create a procedure CCS map
     
 
This is different from the diagnosis code mapping.
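
As a rough illustration of the first (range-based) approach, here is a minimal sketch. It is not the Python script mentioned above: the function name `lookup_by_range`, the table `EXAMPLE_RANGES`, and the category labels are all hypothetical placeholders, and the real ranges would have to come from an actual CCS procedure range file.

```python
from typing import Optional

# Hypothetical (low, high, category) ranges. The labels are placeholders,
# not real CCS categories.
EXAMPLE_RANGES = [
    ("10000", "10999", "category_A"),
    ("11000", "11999", "category_B"),
]


def lookup_by_range(code, ranges=EXAMPLE_RANGES) -> Optional[str]:
    """Return the category whose [low, high] range contains `code`, else None."""
    code = str(code).strip()
    for low, high, category in ranges:
        # Compare equal-length strings so alphanumeric codes can be handled too.
        if len(code) == len(low) and low <= code <= high:
            return category
    return None


print(lookup_by_range("10500"))
print(lookup_by_range("99999"))
```

Codes that fall outside every range come back as `None`, which makes it easy to list what still needs a manual lookup.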

## MIMIC III example

In [None]:
import re
import csv
import pandas as pd
import os
import pickle
## PATH TO CPTEVENTS FILE IN MIMIC III
cpt_events_path = './data/CPTEVENTS.csv'
df_cptevents = pd.read_csv(cpt_events_path)


# extract all unique CPT codes as strings
# (both branches of the original filter returned str(x), so a plain cast is equivalent;
# sorting keeps the chunk files reproducible between runs)
all_cpts = sorted(set(df_cptevents['CPT_CD'].astype(str)))

# write them to text files in chunks of 500, as the free version
# of findacode limits the number of codes per entry
chunk_size = 500
for i, start in enumerate(range(0, len(all_cpts), chunk_size)):
    # the with-block closes the file, so no explicit f.close() is needed
    with open('all_cpts{}.txt'.format(i), 'w') as f:
        for item in all_cpts[start:start + chunk_size]:
            f.write("{}\n".format(item))

### Next, enter all the text files into findacode and download the corresponding cpt-hcpcs-ccs.csv files

I have done this for MIMIC III, although I am not sure whether sharing the text files is legal.


**TODO**: look into this and release into repo....

In [None]:
path = './'
files = os.listdir(path)

# for example if you had 5 text files in the previous step
# there should be 5 cpt-hcpcs-ccs.csv files
# below code simply concatenates them
files = [f for f in files if 'cpt-hcpcs-ccs' in f]
# DataFrame.append was removed in pandas 2.0, so read every file and concatenate once
df_map = pd.concat(
    [pd.read_csv(os.path.join(path, f)) for f in files],
    ignore_index=True,
)
# the findacode dump named this column 'css' instead of 'ccs' when I last ran this (2019);
# they might have fixed it since, so check the output csv files
df_map = df_map.rename(columns={'css': 'ccs'})
df_map = df_map.dropna()

code_ccs_map = dict(zip(df_map.code, df_map.ccs))
with open('./code_ccs_map.pkl', 'wb') as handle:
    pickle.dump(code_ccs_map, handle, protocol=pickle.HIGHEST_PROTOCOL)
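
Once the pickle exists, it can be loaded and applied to the `CPT_CD` column. The sketch below shows one way to do that, using a toy mapping and a toy DataFrame so that it runs on its own; `toy_map`, `df_toy`, and the category labels are made-up placeholders, not real CCS values.

```python
import os
import pickle
import tempfile

import pandas as pd

# Toy stand-in for the real mapping. The labels are placeholders.
toy_map = {"10000": "category_A", "11000": "category_B"}

# Round-trip the mapping through a pickle file, as in the cell above.
with tempfile.TemporaryDirectory() as tmp:
    map_path = os.path.join(tmp, "code_ccs_map.pkl")
    with open(map_path, "wb") as handle:
        pickle.dump(toy_map, handle, protocol=pickle.HIGHEST_PROTOCOL)
    with open(map_path, "rb") as handle:
        loaded_map = pickle.load(handle)

# Toy stand-in for df_cptevents.
df_toy = pd.DataFrame({"CPT_CD": ["10000", "11000", "99999"]})

# Cast to str so the column values have the same type as the dictionary keys.
df_toy["CCS"] = df_toy["CPT_CD"].astype(str).map(loaded_map)

# Codes with no entry in the mapping end up as NaN; list them for follow-up.
unmapped = sorted(df_toy.loc[df_toy["CCS"].isna(), "CPT_CD"].unique())
print(df_toy)
print("unmapped codes:", unmapped)
```

With the real data, the keys of `code_ccs_map` may have been read from the csv files as integers rather than strings, so it is worth checking that they match the type of the `CPT_CD` values before calling `map`.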