# NPI-1: NPI is in NPPES

Description: Checks what share of the NPIs in a file also exist in NPPES. The test passes if the coverage rate is reasonable.

Starting Author: Amy Jin (amy@careset.com)

Date: April 30th, 2018

https://docs.google.com/spreadsheets/d/1IYg01IpssJaWHo6KxO4_dSDgXtYNFy41S5cIHFLvlGQ/edit#gid=604789549

## Connection to Parenthood Server

In [1]:
# Packages import
import os
import sys
import numpy as np
import pandas as pd
from collections import Counter
import operator
import mysql.connector
import sshtunnel
import pureyaml

# Handle path
project_dir = os.getcwd()  # dir of current script/notebook file
with open(os.path.join(project_dir, "db.yaml")) as config_file:
    config = pureyaml.load(config_file.read())

# Argument dictionary for sshtunnel
ssh_config = {
    'ssh_address_or_host': ('parenthood.set.care', 22),
    'ssh_username':        config['ssh_username'],
    'ssh_password':        config['ssh_password'],
    'remote_bind_address': ('127.0.0.1', 3306),
    'local_bind_address':  ('0.0.0.0', 3333),
}

# Argument dictionary for mysql.connector
mysql_config = {
    'user':     config['mysql_user'],
    'password': config['mysql_passwd'],
    'host':     config['mysql_host'],
    'database': 'patch',
    'port':     3333,
}

# Connect to Parenthood server (connectivity check only: the tunnel closes when the block exits)
with sshtunnel.SSHTunnelForwarder(**ssh_config) as tunnel:
    print('SSH tunneling successful on port: {}'.format(tunnel.local_bind_port))
    connection = mysql.connector.connect(**mysql_config)
    print('MySQL server connected successfully!')
    connection.close()  # release the connection before the tunnel shuts down

SSH tunneling successful on port: 3333
MySQL server connected successfully!


## Test Function

In [2]:
# --------------------------------------- Inputs: ---------------------------------------
# 1) db_name:                database name in server
# 2) table_name:             table name
# 3) col_name:               column to test
# --------------------------------------- Outputs: --------------------------------------
# 1) Test result:
#     - the number of distinct NPIs that are not in NPPES, and
#     - the total number of distinct NPIs in the tested table
# Values starting with '99999' are excluded from both counts.


def npi_1(db_name, table_name, col_name):
    
    with sshtunnel.SSHTunnelForwarder(**ssh_config) as tunnel:
        connection = mysql.connector.connect(**mysql_config)
        cur = connection.cursor()
            
        print ('Test file: {}.{}'.format(db_name, table_name))
        print ('\n') 
        
        # MySQL query to count distinct NPIs in the file that are missing from NPPES
        # (LEFT JOIN keeps every file row; B.npi IS NULL selects the rows with no NPPES match)
        query = ('''
                SELECT COUNT(DISTINCT A.{col1})
                FROM {db}.{t1} AS A
                LEFT JOIN scratch.npi_quick AS B
                ON A.{col1} = B.npi
                WHERE B.npi IS NULL AND A.{col1} NOT LIKE '99999%';
        '''.format(db = db_name, t1 = table_name, col1 = col_name))
        
        cur.execute(query)

        print ("The number of distinct {} that are not in NPPES is:".format(col_name) + '\n')
        for row in cur.fetchall():
            for value in row:
                print(value)
            print('\n')
        
        # MySQL query to get distinct count of NPIs in the file
        query = ('''
                SELECT COUNT(DISTINCT A.{col1})
                FROM {db}.{t1} AS A
                WHERE A.{col1} NOT LIKE '99999%';
        '''.format(db = db_name, t1 = table_name, col1 = col_name))
        
        cur.execute(query)

        print ("The number of distinct {} in {}.{} is:".format(col_name, db_name, table_name) + '\n')
        for row in cur.fetchall():
            for value in row:
                print(value)
            print('\n')
            
        cur.close()
        connection.close()
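
The two queries in `npi_1` can be tried without access to the Parenthood server. The following is a minimal sketch, assuming an in-memory SQLite database stands in for MySQL. The `claims` table, the helper `count_missing_and_total`, and the sample NPI values are hypothetical, and `npi_quick` is referenced without the `scratch` schema because SQLite has no such database here.

```python
import sqlite3


def count_missing_and_total(conn, table_name, col_name):
    """Return (distinct NPIs missing from npi_quick, distinct NPIs in the table)."""
    cur = conn.cursor()

    # Anti-join: keep rows of the tested table with no match in npi_quick
    missing_query = '''
        SELECT COUNT(DISTINCT A.{col})
        FROM {t} AS A
        LEFT JOIN npi_quick AS B
        ON A.{col} = B.npi
        WHERE B.npi IS NULL AND A.{col} NOT LIKE '99999%';
    '''.format(t=table_name, col=col_name)
    cur.execute(missing_query)
    missing = cur.fetchone()[0]

    # Denominator: all distinct NPIs in the tested table, same exclusion
    total_query = '''
        SELECT COUNT(DISTINCT A.{col})
        FROM {t} AS A
        WHERE A.{col} NOT LIKE '99999%';
    '''.format(t=table_name, col=col_name)
    cur.execute(total_query)
    total = cur.fetchone()[0]

    cur.close()
    return missing, total


# Hypothetical sample data: two NPIs known to the lookup table,
# one unknown NPI, and one value starting with 99999 in the tested table
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE npi_quick (npi TEXT)')
conn.execute('CREATE TABLE claims (npi TEXT)')
conn.executemany('INSERT INTO npi_quick VALUES (?)',
                 [('1000000001',), ('1000000002',)])
conn.executemany('INSERT INTO claims VALUES (?)',
                 [('1000000001',), ('1000000001',), ('1000000002',),
                  ('1000000003',), ('9999900001',)])

missing, total = count_missing_and_total(conn, 'claims', 'npi')
print(missing, total)
conn.close()
```

Returning the counts instead of printing them makes the result easier to check or reuse, which is a design choice of this sketch and not something the notebook does.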

## Test Example

In [3]:
npi_1('client_abbvie','C_AB_MM_PATIENTS_RQ18R24_3', 'npi')

Test file: client_abbvie.C_AB_MM_PATIENTS_RQ18R24_3


The number of distinct npi that are not in NPPES is:

882


The number of distinct npi in client_abbvie.C_AB_MM_PATIENTS_RQ18R24_3 is:

347567




In [7]:
npi_1('client_abbvie','C_AB_MM_PDE_PD16R24_12', 'npi')

Test file: client_abbvie.C_AB_MM_PDE_PD16R24_12


The number of distinct npi that are not in NPPES is:

0


The number of distinct npi in client_abbvie.C_AB_MM_PDE_PD16R24_12 is:

12233




In [5]:
npi_1('client_abbvie','C_AB_MM_HCPCS_RQ18R24_3','npi')

Test file: client_abbvie.C_AB_MM_HCPCS_RQ18R24_3


The number of distinct npi that are not in NPPES is:

48


The number of distinct npi in client_abbvie.C_AB_MM_HCPCS_RQ18R24_3 is:

19331




In [3]:
npi_1('client_abbvie','CareSet_Internal_Only_AbbVie_MM_HCP_Targeting','`Healthcare Professional (HCP) NPI`')

Test file: client_abbvie.CareSet_Internal_Only_AbbVie_MM_HCP_Targeting


The number of distinct `Healthcare Professional (HCP) NPI` that are not in NPPES is:

0


The number of distinct `Healthcare Professional (HCP) NPI` in client_abbvie.CareSet_Internal_Only_AbbVie_MM_HCP_Targeting is:

21777
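
The description says the test passes when coverage is reasonable, but the notebook only prints the two counts. A minimal sketch of turning them into a coverage rate follows, using the counts printed above. The helper `coverage_rate` is hypothetical, and no pass threshold is applied because the source does not state one.

```python
def coverage_rate(missing, total):
    """Share of distinct NPIs in the file that were found in NPPES."""
    if total == 0:
        raise ValueError('total must be positive')
    return 1 - missing / total


# (missing, total) pairs copied from the test outputs above
results = {
    'C_AB_MM_PATIENTS_RQ18R24_3': (882, 347567),
    'C_AB_MM_PDE_PD16R24_12': (0, 12233),
    'C_AB_MM_HCPCS_RQ18R24_3': (48, 19331),
    'CareSet_Internal_Only_AbbVie_MM_HCP_Targeting': (0, 21777),
}

rates = {name: coverage_rate(m, t) for name, (m, t) in results.items()}
for name, rate in rates.items():
    print('{}: {:.4%}'.format(name, rate))
```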


