[Question]: CUI Question #64

### What’s your question?

Hello,
I’ve run into a question while running cTAKES. When I process a document through cTAKES, the XMI output contains numerous XML tags. The tags our lab is interested in are those carrying CUIs; for example, the XMI tag

<refsem:UmlsConcept xmi:id="16626" codingScheme="SNOMEDCT_US" code="7092007" score="0.0" disambiguated="false" cui="C0025859" tui="T109" preferredText="Metoprolol-containing product"/>

would indicate that the CUI C0025859 (Metoprolol-containing product) was found in a given document.

In the input document text I can locate three instances of the drug Metoprolol. When I open the cTAKES XMI output in the CVD viewer, each of those three mentions has its own ontologyConceptArr with 4 members, looking like this:

// found at org.apache.ctakes.typesystem.type.textsem.EventMention
//       org.apache.ctakes.typesystem.type.textsem.MedicationMention
//           ontologyConceptArr = uima.cas.FSArray[4]

<refsem:UmlsConcept xmi:id="16626" codingScheme="SNOMEDCT_US" code="7092007" score="0.0" disambiguated="false" cui="C0025859" tui="T109" preferredText="Metoprolol-containing product"/>
<refsem:UmlsConcept xmi:id="16646" codingScheme="SNOMEDCT_US" code="7092007" score="0.0" disambiguated="false" cui="C0025859" tui="T121" preferredText="Metoprolol-containing product"/>
<refsem:UmlsConcept xmi:id="16616" codingScheme="SNOMEDCT_US" code="372826007" score="0.0" disambiguated="false" cui="C0025859" tui="T109" preferredText="Metoprolol-containing product"/>
<refsem:UmlsConcept xmi:id="16636" codingScheme="SNOMEDCT_US" code="372826007" score="0.0" disambiguated="false" cui="C0025859" tui="T121" preferredText="Metoprolol-containing product"/>

Although not shown here, a single uima.cas.FSArray can also contain different CUIs, even though the array maps to a single string of text in the document.

If I walk the XMI file and retrieve all CUIs, the CUI C0025859 is found 12 times (3 mentions × 4 concept entries each). However, if I extend the Java class JCasAnnotator_ImplBase to extract the CUIs from the JCas annotations, it finds this CUI only 3 times.
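To make the discrepancy concrete, here is a minimal Python sketch of the two counting strategies, using only the standard library. It is an illustration under assumptions, not cTAKES code: it assumes each mention element refers to its concepts through a space-separated `ontologyConceptArr` attribute of `xmi:id` values, and `build_sample`, `count_cuis`, the namespace URIs, and the generated ids are hypothetical stand-ins for real output.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Attribute name ElementTree uses for xmi:id (standard XMI namespace).
XMI_ID = "{http://www.omg.org/XMI}id"


def local_name(qualified):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return qualified.rsplit("}", 1)[-1]


def build_sample():
    """Hypothetical stand-in for cTAKES output: 3 mentions x 4 concept entries."""
    concepts, mentions = [], []
    next_id = 100  # made-up ids, not the ones from the real file
    for begin in (10, 50, 90):
        ids = []
        for code in ("7092007", "372826007"):
            for tui in ("T109", "T121"):
                ids.append(str(next_id))
                concepts.append(
                    f'<refsem:UmlsConcept xmi:id="{next_id}" '
                    f'codingScheme="SNOMEDCT_US" code="{code}" '
                    f'cui="C0025859" tui="{tui}"/>'
                )
                next_id += 1
        refs = " ".join(ids)
        mentions.append(
            f'<textsem:MedicationMention xmi:id="{next_id}" '
            f'begin="{begin}" end="{begin + 10}" ontologyConceptArr="{refs}"/>'
        )
        next_id += 1
    # Namespace URIs are assumed; the counting code does not depend on them.
    return (
        '<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" '
        'xmlns:refsem="http:///org/apache/ctakes/typesystem/type/refsem.ecore" '
        'xmlns:textsem="http:///org/apache/ctakes/typesystem/type/textsem.ecore">'
        + "".join(concepts) + "".join(mentions) + "</xmi:XMI>"
    )


def count_cuis(xmi_text):
    """Return (count per UmlsConcept entry, count per mention) as Counters."""
    root = ET.fromstring(xmi_text)
    cui_by_id = {}  # xmi:id -> cui for every UmlsConcept element
    mentions = []   # one list of referenced concept ids per mention
    for elem in root.iter():
        if local_name(elem.tag) == "UmlsConcept":
            cui_by_id[elem.attrib[XMI_ID]] = elem.attrib["cui"]
        elif "ontologyConceptArr" in elem.attrib:
            mentions.append(elem.attrib["ontologyConceptArr"].split())

    # Strategy 1: every UmlsConcept element counts once.
    per_concept = Counter(cui_by_id.values())

    # Strategy 2: each distinct CUI counts once per mention.
    per_mention = Counter()
    for ids in mentions:
        per_mention.update({cui_by_id[i] for i in ids if i in cui_by_id})
    return per_concept, per_mention


if __name__ == "__main__":
    per_concept, per_mention = count_cuis(build_sample())
    print(per_concept["C0025859"])  # 12
    print(per_mention["C0025859"])  # 3
```

Under these assumptions, the first count grows with every code/TUI combination attached to a mention, while the second counts how many text spans were mapped to the CUI. The second would presumably match an annotator that deduplicates CUIs per mention.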

If part of the output needs to include a count of all CUIs found by cTAKES within a given document, which method is correct?

Thanks!

### Context

_No response_

### What category does this question fall under?

None

### Contact Details

_No response_
