Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CUI --> CONCEPT Mapping #1

Open
ericaVoss opened this issue Oct 26, 2020 · 8 comments
Open

CUI --> CONCEPT Mapping #1

ericaVoss opened this issue Oct 26, 2020 · 8 comments

Comments

@ericaVoss
Copy link

Would you mind posting the mapping files? :)

bitmoji

@jmbanda
Copy link
Contributor

jmbanda commented Oct 26, 2020

AnankeV2.zip

Find them attached here. I will update the repo later.

@ericaVoss
Copy link
Author

Thank you, took a look at the file.

I'm going to bounce it off the OMOP Vocabulary and make a proper SOURCE_TO_CONCEPT_MAP file. I'll share the SQL with you after if you are interest.

@ericaVoss
Copy link
Author

Basically I created a STCM file from your file. Taking where your file mapped to standard CONCEPT_IDs.

318275 CUIs (only 30%) do not map to a standard concept. We are going to run this in a system that uses CUIs and see if any of them are common.

WITH CTE_CUI AS (
	SELECT DISTINCT CUI
	FROM SCRATCH.dbo.EAV_ANANKE
), 
CTE_CUI_TO_STANDARD AS (
	SELECT a.CUI, c.CONCEPT_ID AS TARGET_CONCEPT_ID, c.CONCEPT_NAME AS TARGET_CONCEPT_NAME, c.VOCABULARY_ID AS TARGET_VOCABULARY_ID
	FROM SCRATCH.dbo.EAV_ANANKE a
		JOIN CONCEPT c
			ON c.CONCEPT_ID = a.CONCEPT_ID
			AND c.STANDARD_CONCEPT = 'S'
)
SELECT 
	c.CUI AS SOURCE_CODE, 
	0 AS SOURCE_CONCEPT_ID, 
	'LYNXCARE_CUI_TO_STANDARD' AS SOURCE_VOCABULARY_ID, 
	NULL AS SOURCE_CODE_DESCRIPTION, 
	CASE WHEN cs.TARGET_CONCEPT_ID IS NULL THEN 0 ELSE cs.TARGET_CONCEPT_ID END AS TARGET_CONCEPT_ID, 
	cs.TARGET_CONCEPT_NAME, cs.TARGET_VOCABULARY_ID, 
	'1970-01-01' AS VALID_START_DATE,
	'2099-12-31' AS VALID_END_DATE,
	NULL AS INVALID_REASON
FROM CTE_CUI c
	LEFT OUTER JOIN CTE_CUI_TO_STANDARD cs
		ON cs.CUI = c.CUI

@ericaVoss
Copy link
Author

ericaVoss commented Jan 3, 2021

We realized that sometimes we can make a double hop to get to a standard code.

WITH CTE_CUI AS (
	SELECT DISTINCT CUI
	FROM SCRATCH.dbo.EAV_ANANKE
), 
CTE_CUI_TO_STANDARD AS (
	SELECT a.CUI, c.CONCEPT_ID AS TARGET_CONCEPT_ID, c.CONCEPT_NAME AS TARGET_CONCEPT_NAME, c.VOCABULARY_ID AS TARGET_VOCABULARY_ID
	FROM SCRATCH.dbo.EAV_ANANKE a
		JOIN CONCEPT c
			ON c.CONCEPT_ID = a.CONCEPT_ID
			AND c.STANDARD_CONCEPT = 'S'
),
CTE_CUI_HOP_TO_STANDARD AS (
	--Sometimes the non-standard codes mapped can get us to a standard code
	SELECT a.CUI, c2.CONCEPT_ID AS TARGET_CONCEPT_ID, c2.CONCEPT_NAME AS TARGET_CONCEPT_NAME, c2.VOCABULARY_ID AS TARGET_VOCABULARY_ID
	FROM SCRATCH.dbo.EAV_ANANKE a
		JOIN CONCEPT c
			ON c.CONCEPT_ID = a.CONCEPT_ID
		JOIN CONCEPT_RELATIONSHIP cr
			ON cr.CONCEPT_ID_1 = c.CONCEPT_ID
			AND cr.RELATIONSHIP_ID = 'Maps To'
                        AND cr.INVALID_REASON IS NULL
		JOIN CONCEPT c2
			ON c2.CONCEPT_ID = cr.CONCEPT_ID_2
), 
CTE_MAPPED AS (
	--Take standard first
	SELECT 
		c.CUI, 
		cs.TARGET_CONCEPT_ID, cs.TARGET_CONCEPT_NAME, cs.TARGET_VOCABULARY_ID
	FROM CTE_CUI c
		JOIN CTE_CUI_TO_STANDARD cs
			ON cs.CUI = c.CUI
	UNION ALL
	--For other codes try to hop to non-standard
	SELECT 
		c.CUI, 
		cs.TARGET_CONCEPT_ID, cs.TARGET_CONCEPT_NAME, cs.TARGET_VOCABULARY_ID
	FROM CTE_CUI c
		JOIN CTE_CUI_HOP_TO_STANDARD cs
			ON cs.CUI = c.CUI
			AND c.CUI NOT IN (SELECT CUI FROM CTE_CUI_TO_STANDARD)
)
SELECT c.CUI AS SOURCE_CODE, 
		0 AS SOURCE_CONCEPT_ID, 
		'LYNXCARE_CUI_TO_STANDARD' AS SOURCE_VOCABULARY_ID, 
		NULL AS SOURCE_CODE_DESCRIPTION, 
		c.TARGET_CONCEPT_ID, c.TARGET_CONCEPT_NAME, c.TARGET_VOCABULARY_ID, 
		'1970-01-01' AS VALID_START_DATE,
		'2099-12-31' AS VALID_END_DATE,
		NULL AS INVALID_REASON
FROM CTE_MAPPED c
UNION ALL
SELECT c.CUI AS SOURCE_CODE, 
		0 AS SOURCE_CONCEPT_ID, 
		'LYNXCARE_CUI_TO_STANDARD' AS SOURCE_VOCABULARY_ID, 
		NULL AS SOURCE_CODE_DESCRIPTION, 
		0 AS TARGET_CONCEPT_ID, 
		NULL AS TARGET_CONCEPT_NAME, 
		NULL AS TARGET_VOCABULARY_ID, 
		'1970-01-01' AS VALID_START_DATE,
		'2099-12-31' AS VALID_END_DATE,
		NULL AS INVALID_REASON
FROM CTE_CUI c
WHERE c.CUI NOT IN (SELECT DISTINCT CUI FROM CTE_MAPPED)

@jmbanda
Copy link
Contributor

jmbanda commented Jan 3, 2021

Thanks! Had in my to-do list to update this repo this week! So thank you for the update!

@ericaVoss
Copy link
Author

There are are 51 codes that have 6+ more CONCEPT_ID maps.

For example:
C3652516-pneumococcus, purified polysaccharides antigen

Maps to:

CONCEPT_ID CONCEPT_ID
40163661 Streptococcus pneumoniae type 1 capsular polysaccharide antigen
40163668 Streptococcus pneumoniae type 10A capsular polysaccharide antigen
40163670 Streptococcus pneumoniae type 11A capsular polysaccharide antigen
40163672 Streptococcus pneumoniae type 12F capsular polysaccharide antigen
40163674 Streptococcus pneumoniae type 14 capsular polysaccharide antigen
40163676 Streptococcus pneumoniae type 15B capsular polysaccharide antigen
40163678 Streptococcus pneumoniae type 17F capsular polysaccharide antigen
40163680 Streptococcus pneumoniae type 18C capsular polysaccharide antigen
40163682 Streptococcus pneumoniae type 19A capsular polysaccharide antigen
40163684 Streptococcus pneumoniae type 19F capsular polysaccharide antigen
40163686 Streptococcus pneumoniae type 2 capsular polysaccharide antigen
40163688 Streptococcus pneumoniae type 20 capsular polysaccharide antigen
40163690 Streptococcus pneumoniae type 22F capsular polysaccharide antigen
40163692 Streptococcus pneumoniae type 23F capsular polysaccharide antigen
40163694 Streptococcus pneumoniae type 3 capsular polysaccharide antigen
40163696 Streptococcus pneumoniae type 33F capsular polysaccharide antigen
40163698 Streptococcus pneumoniae type 4 capsular polysaccharide antigen
40163700 Streptococcus pneumoniae type 5 capsular polysaccharide antigen
40163702 Streptococcus pneumoniae type 6B capsular polysaccharide antigen
40163704 Streptococcus pneumoniae type 7F capsular polysaccharide antigen
40163706 Streptococcus pneumoniae type 8 capsular polysaccharide antigen
40163708 Streptococcus pneumoniae type 9N capsular polysaccharide antigen
40163710 Streptococcus pneumoniae type 9V capsular polysaccharide antigen

We are thinking about just ignoring the maps that go to 6+ CONCEPT_IDs. Just an FYI . . .

@sdebruyn
Copy link

sdebruyn commented Jan 5, 2021

Shouldn't CTE_CUI_HOP_TO_STANDARD filter on valid relationships?

@ericaVoss
Copy link
Author

@sdebruyn - probably proper form, but at least currently all 'Maps To' relationships are valid. I updated the above SQL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants