# Automate HCPCS and ICD-10 Check for Que

Every quarter, the business requests that HCPCS and ICD-10 codes be updated in the master list for **Prism**, **Que**, **Salesforce**, and **Essette**. This notebook checks the HCPCS and ICD-10 codes currently loaded in **Que** against the master lists and writes the differences to .csv files.

## Prerequisites
- No prior knowledge of Python is required to run this notebook, although it will help you follow the logic and syntax. We use the **pandas** library to load each input file into a DataFrame and to clean and compare the codes, and the **datetime** library to build a date stamp for file naming.
- The input files must be named exactly as set in the **Filenames** block of the first code cell (for example, **master-list.csv**). Either **rename** your files to match or edit those variables. If this step is not done, the script **will not run** properly.
- Save this Jupyter Notebook and the input files in the same directory, i.e. in the same location on your local machine.
- You will need a way to run Jupyter Notebooks on your local machine. This notebook assumes Anaconda is already installed. If it is not, install it by following https://www.anaconda.com/docs/getting-started/anaconda/install.

## Overview of the Process
This notebook reads four files into DataFrames using the pandas library: the HCPCS master list (**master-list.csv**), the ICD-10 master list (**icd10cm_codes_2026.txt**), and the HCPCS and ICD-10 exports from Que (**Que-HCPCS.csv** and **Results-4.csv**).

From the HCPCS master list, the columns we clean are
- **HCPCS**
- **Long Description**
- **Short Description**

The ICD-10 master list arrives as one raw text line per code, so we split each line into a **Code** column and a **Description** column. After the files are read and cleaned, we compare the codes in each master list against the codes in Que using set logic to find
1. Codes that are in the master list but not in Que (these need to be uploaded to Que)
2. Codes that are in Que but not in the master list
3. Codes that are in both and need no changes

The codes that differ are written to their own separate .csv files.

In [2]:
# Import Libraries
import pandas as pd
from datetime import datetime

today = datetime.today() # Get Today's Date
formatted_date = today.strftime("%m%d%Y") # Format today's date as MMDDYYYY

# Filenames (replace with your actual file names)
master_hcpc_file = "master-list.csv"
master_icd10_file = "icd10cm_codes_2026.txt"
que_hcpcs_file = "Que-HCPCS.csv"
que_icd10_file = "Results-4.csv"
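
The `formatted_date` value built above is meant for file naming. Here is a minimal sketch of how it might be used to date-stamp an output file. The helper `dated_filename` and the `<name>_<MMDDYYYY>.csv` pattern are assumptions made for illustration, not a convention this notebook defines.

```python
from datetime import datetime


def dated_filename(stem: str, when: datetime, ext: str = "csv") -> str:
    """Return '<stem>_<MMDDYYYY>.<ext>' (hypothetical naming pattern)."""
    return f"{stem}_{when.strftime('%m%d%Y')}.{ext}"


# A fixed date keeps the example reproducible; pass datetime.today() in practice.
example_name = dated_filename("Fix_HCPCS_In_Que", datetime(2026, 1, 15))
print(example_name)  # Fix_HCPCS_In_Que_01152026.csv
```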

In [3]:
# Load the input files into DataFrames (every column is read as text)
master_hcpc = pd.read_csv(master_hcpc_file, encoding="cp1252", dtype=str)
master_icd10 = pd.read_csv(master_icd10_file,
                           sep = "\t",
                           header = None,
                           names = ["raw"],
                           dtype = str)
que_hcpcs = pd.read_csv(que_hcpcs_file, encoding="cp1252", dtype=str)
que_icd10 = pd.read_csv(que_icd10_file, dtype = str)
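
If an input file is missing or misnamed, `pd.read_csv` stops with a `FileNotFoundError`. A small pre-check run before the load cell can report every missing file at once. The helper `missing_files` below is hypothetical and only a sketch; it assumes the files sit in the notebook's working directory.

```python
from pathlib import Path


def missing_files(filenames, directory="."):
    """Return the names in `filenames` that are not files inside `directory`."""
    base = Path(directory)
    return [name for name in filenames if not (base / name).is_file()]


# File names copied from the Filenames block in the first code cell.
expected = ["master-list.csv", "icd10cm_codes_2026.txt", "Que-HCPCS.csv", "Results-4.csv"]

missing = missing_files(expected)
if missing:
    print("Missing input files: " + ", ".join(missing))
```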

In [4]:
master_icd10.head()

Unnamed: 0,raw
0,"A000 Cholera due to Vibrio cholerae 01, bio..."
1,"A001 Cholera due to Vibrio cholerae 01, bio..."
2,"A009 Cholera, unspecified"
3,"A0100 Typhoid fever, unspecified"
4,A0101 Typhoid meningitis


In [5]:
# Clean the Data
# HCPCS Master List Clean Up
master_hcpc["HCPCS"] = master_hcpc["HCPCS"].str.replace(r"[^a-zA-Z0-9 ,]", "", regex=True)
master_hcpc["Long Description"] = master_hcpc["Long Description"].str.replace(r"[^a-zA-Z0-9 ,]", "", regex=True)
master_hcpc["Short Description"] = master_hcpc["Short Description"].str.replace(r"[^a-zA-Z0-9 ,]", "", regex=True)

# ICD10 Master List Clean Up
# Split the whitespace run between Code and Description
master_icd10[["Code", "Description"]] = master_icd10["raw"].str.strip().str.split(r"\s+",
                                                                                 n = 1,
                                                                                 expand = True)
# Drop the raw Column
master_icd10 = master_icd10.drop(columns = ["raw"])
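
To make the two cleaning steps concrete, here is a small sketch on made-up rows. The ICD-10 lines reuse codes shown in the `head()` output, with assumed spacing. The HCPCS values are invented for illustration. Note that the pattern `[^a-zA-Z0-9 ,]` keeps only letters, digits, spaces, and commas, so periods, slashes, and hyphens are removed as well.

```python
import pandas as pd

# ICD-10: split each raw line once, at the first run of whitespace.
sample = pd.DataFrame({"raw": ["A009    Cholera, unspecified", "A0101   Typhoid meningitis "]})
sample[["Code", "Description"]] = (
    sample["raw"].str.strip().str.split(r"\s+", n=1, expand=True)
)
sample = sample.drop(columns=["raw"])
print(sample)

# HCPCS: strip every character that is not a letter, digit, space, or comma.
dirty = pd.Series(["J1234*", "Inj., 10 mg/mL"])  # invented values
cleaned = dirty.str.replace(r"[^a-zA-Z0-9 ,]", "", regex=True)
print(cleaned.tolist())  # ['J1234', 'Inj, 10 mgmL']
```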

In [6]:
master_icd10.to_csv("icd10-master-list.csv", index = False)

In [7]:
# Using Set Logic, we'll compare the sets of Master List HCPCS vs Que HCPCS
codes_in_master = set(master_hcpc["HCPCS"])
codes_in_que = set(que_hcpcs["Code"])

only_in_master = codes_in_master - codes_in_que
only_in_que = codes_in_que - codes_in_master
in_both = codes_in_master & codes_in_que
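
The set comparison above can be seen on toy data. The sketch below also drops missing values and normalizes whitespace and case before building the sets. That normalization is an assumption: this notebook does not do it, and whether the Que export needs it depends on the data. The codes and the helper `to_code_set` are made up.

```python
import pandas as pd

master = pd.Series(["A0001", "A0002", "A0003"])
que = pd.Series(["A0002", " a0003 ", "Z9999", None])


def to_code_set(series: pd.Series) -> set:
    """Drop missing values, trim whitespace, upper-case, and return a set."""
    return set(series.dropna().str.strip().str.upper())


codes_in_master = to_code_set(master)
codes_in_que = to_code_set(que)

only_in_master = codes_in_master - codes_in_que  # in the master list, not in Que
only_in_que = codes_in_que - codes_in_master     # in Que, not in the master list
in_both = codes_in_master & codes_in_que         # present on both sides

print(sorted(only_in_master), sorted(only_in_que), sorted(in_both))
```

Without the `dropna()` call, a blank code cell would enter the set as a missing value and be counted as a code.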

In [8]:
print("There are " + str(len(only_in_master)) + " HCPCS that need to be uploaded to Que")
print("")
print("There are " + str(len(only_in_que)) + " HCPCS that are in Que that are Not in the Master List")

There are 0 HCPCS that need to be uploaded to Que

There are 1954 HCPCS that are in Que that are Not in the Master List


In [9]:
# Build a DataFrame of the Master List rows whose codes are missing from Que
only_in_master_df = master_hcpc[master_hcpc["HCPCS"].isin(only_in_master)]

# Output the DataFrames to CSV Files
only_in_master_df.to_csv("Fix_HCPCS_In_Que.csv", index = False)
#new_hcpcs.to_csv("New_HCPCS.csv", index = False)
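
The HCPCS step above writes only the codes that are missing from Que. If the codes that exist only in Que are also wanted as rows, one alternative to set logic is an outer merge with `indicator=True`, which labels each row by the side it came from. This is a sketch on made-up data, not the method this notebook uses. The column names `HCPCS` and `Code` follow the cells above.

```python
import pandas as pd

master = pd.DataFrame(
    {"HCPCS": ["A0001", "A0002"], "Short Description": ["Item one", "Item two"]}
)
que = pd.DataFrame({"Code": ["A0002", "Z9999"]})

# indicator=True adds a "_merge" column: left_only, right_only, or both.
merged = master.merge(que, how="outer", left_on="HCPCS", right_on="Code", indicator=True)

only_in_master_df = merged[merged["_merge"] == "left_only"]  # missing from Que
only_in_que_df = merged[merged["_merge"] == "right_only"]    # not in the master list

print(only_in_master_df["HCPCS"].tolist(), only_in_que_df["Code"].tolist())
```

Duplicate codes on either side produce repeated rows in the merge, so dropping duplicates first may be needed.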

## ICD-10 Check

We will now check the Master List ICD-10 Codes against the ICD-10 Codes that have been uploaded to Que, using another **Set** comparison between the Master List and Que.

**Additional Notes**: 
- The table name in Prism MDM that houses the ICD-10 Codes is `list_dtls`
- The column name `listl_dtl_code` maps to **Code** in the Master List
- The column name `list_dtl_desc` maps to **Description** in the Master List
- The column name `dtl_id` maps to the **URL** in the Prism UI
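
If an extract of that Prism table is ever compared directly, the column mapping in the notes above can be applied with a rename. The sketch below assumes a hypothetical two-row extract. The column names are copied exactly as written in the notes, and the `dtl_id` values are invented.

```python
import pandas as pd

# Mapping taken from the notes above (Prism column -> Master List column).
PRISM_TO_MASTER = {"listl_dtl_code": "Code", "list_dtl_desc": "Description"}

# Hypothetical extract of list_dtls; real data would come from Prism MDM.
prism = pd.DataFrame(
    {
        "dtl_id": ["101", "102"],
        "listl_dtl_code": ["A009", "A0101"],
        "list_dtl_desc": ["Cholera, unspecified", "Typhoid meningitis"],
    }
)

prism_renamed = prism.rename(columns=PRISM_TO_MASTER)
print(list(prism_renamed.columns))  # ['dtl_id', 'Code', 'Description']
```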

In [11]:
que_icd10.head()

Unnamed: 0,ICD10,TSLastModified,Code,CodeDesc,CodeShort,Rank,CreatedBy,CreateDate,LastModifiedBy,LastModifiedDate,StartDate,EndDate,PrismID
0,6,0x0000000003203D46,37274,Amputation,,,,,1,2016-10-25 22:00:44.387,2015-10-01,2016-09-30,
1,7,0x0000000003203D47,S37274,Amputation,,,,,1,2016-10-25 22:00:44.387,2015-10-01,2016-09-30,
2,8,0x0000000003203D48,test,test,,,,,1,2016-10-25 22:00:44.387,2015-10-01,2016-09-30,
3,9,0x0000000003203D49,test2,test2,,,,,1,2016-10-25 22:00:44.387,2015-10-01,2016-09-30,
4,10,0x0000000003203D4A,test3,test3,,,,,1,2016-10-25 22:00:44.387,2015-10-01,2016-09-30,


In [12]:
# Using Set Logic, we'll compare the sets of Master List ICD-10 Codes vs Que ICD-10 Codes
codes_in_master = set(master_icd10["Code"])
codes_in_que = set(que_icd10["Code"])

only_in_master = codes_in_master - codes_in_que
only_in_que = codes_in_que - codes_in_master
in_both = codes_in_master & codes_in_que

In [13]:
# Print Results
print("There are " + str(len(only_in_master)) + " ICD-10 Codes that need to be uploaded to Que")
print("")
print("There are " + str(len(only_in_que)) + " ICD-10 Codes that are in Que that are Not in the Master List")

There are 0 ICD-10 Codes that need to be uploaded to Que

There are 1273 ICD-10 Codes that are in Que that are Not in the Master List


In [14]:
# Convert the Set into a DataFrame
only_in_master_df = master_icd10[master_icd10["Code"].isin(only_in_master)]
only_in_que_df = que_icd10[que_icd10["Code"].isin(only_in_que)]
# Output the DataFrames to CSV Files
only_in_master_df.to_csv("ICD10s-Not-In-Que.csv", index = False) # Add these ICD-10 Codes to Que from Master List
only_in_que_df.to_csv("Remove_ICD10s_In_Que.csv", index = False) # Remove these ICD-10 Codes from Que because they are NOT on Master List
#new_hcpcs.to_csv("New_HCPCS.csv", index = False)