## Run Entity Extraction

#### From PubMed Article nejmoa1503184.pdf

Paragraphs were manually copied and pasted as text chunks. Some paragraphs had to be split to comply with the 512 length restriction.
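The manual splitting described above could also be automated. The following is a minimal sketch, not the procedure used in this notebook: `split_into_chunks` is a hypothetical helper, and because the notebook does not say what unit the 512 limit is measured in, the sketch counts whitespace-separated words as a rough proxy for the model's own tokenizer.

```python
import re


def split_into_chunks(paragraph: str, max_len: int = 512) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_len words.

    Word count is an assumed stand-in for the model's real token count.
    """
    # Naive sentence split: break after '.', '?' or '!' followed by whitespace.
    sentences = re.split(r"(?<=[.?!])\s+", paragraph.strip())
    chunks, current, current_len = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue  # skip empty fragments (e.g. empty input)
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and current_len + n > max_len:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Two caveats: a single sentence longer than `max_len` is kept whole and still exceeds the limit, and the naive splitter also breaks after abbreviations such as "Fig.", which is harmless here because the pieces are re-joined within a chunk but can move a chunk boundary to an odd place.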

In [1]:
text1 = """BACKGROUND In a phase 2 trial, selexipag, an oral selective IP prostacyclin-receptor agonist, was shown to be beneficial in the treatment of pulmonary arterial hypertension."""

text2 = """METHODS In this event-driven, phase 3, randomized, double-blind, placebo-controlled trial, we randomly assigned 1156 patients with pulmonary arterial hypertension to receive placebo or selexipag in individualized doses (maximum dose, 1600 μg twice daily). Patients were eligible for enrollment if they were not receiving treatment for pulmonary arterial hypertension or if they were receiving a stable dose of an endothelin-receptor antagonist, a phosphodiesterase type 5 inhibitor, or both. The primary end point was a composite of death from any cause or a complication related to pulmonary arterial hypertension up to the end of the treatment period (defined for each patient as 7 days after the date of the last intake of selexipag or placebo)."""

text3 = """A primary end-point event occurred in 397 patients 41.6% of those in the placebo group and 27.0% of those in the selexipag group (hazard ratio in the selexipag group as compared with the placebo group, 0.60; 99% confidence interval, 0.46 to 0.78; P < 0.001). Disease progression and hospitalization accounted for 81.9% of the events. The effect of selexipag with respect to the primary end point was similar in the subgroup of patients who were not receiving treatment for the disease at baseline and in the subgroup of patients who were already receiving treatment at baseline (including those who were receiving a combination of two therapies). By the end of the study, 105 patients in the placebo group and 100 patients in the selexipag group had died from any cause. """
text3a = """Overall, 7.1% of patients in the placebo group and 14.3% of patients in the selexipag group discontinued their assigned regimen prematurely because of adverse events. The most common adverse events  in the selexipag group  were consistent with  the known side effects  of prostacyclin, including headache, diarrhea, nausea, and jaw pain."""

text4 = """CONCLUSIONS Among patients with pulmonary arterial hypertension, the risk of the primary composite end point of death or a complication related to pulmonary arterial hypertension was significantly lower with selexipag than with placebo. There was no significant difference in  mortality between the two  study  groups.  """

text5 = """Pulmonary arterial hypertension is a severe disease with a poor prognosis despite available treatment options. Current recommendations support the use of a combination of therapies that target the endothelin, nitric-oxide, and prostacyclin pathways. Despite the benefits of intravenous prostacyclin therapy, many patients with pulmonary arterial hypertension die without ever receiving this treatment. The burden and risks related to the administration of prostacyclin therapy are probably contributing factors."""

text6 = """Selexipag is an oral selective IP prostacyclin-receptor agonist that is structurally distinct from prostacyclin. In a placebo-controlled, phase 2 trial involving patients who were already receiving treatment for pulmonary arterial hypertension, selexipag increased the cardiac index (at week 17, the treatment effect for the placebo-corrected change from baseline was an increase of 0.5 liters per minute per square meter of body-surface area) and significantly reduced pulmonary vascular resistance by 33% at week 17. We conducted an event-driven, phase 3 trial, the Prostacyclin (PGI2) Receptor Agonist In Pulmonary Arterial Hypertension (GRIPHON) study, to investigate the safety and efficacy of selexipag in patients with pulmonary arterial hypertension who were not receiving therapy at baseline and those who were already receiving one or two therapies for the disease at baseline."""


In [2]:
text7 = """Study Design The GRIPHON study was a multicenter, double-blind, randomized, parallel-group, placebo-controlled, event-driven, phase 3 study. The steering committee, in collaboration with the sponsor (Actelion Pharmaceuticals), designed the trial and oversaw its conduct and the analyses of the data. The study protocol, which is available with the full text of this article at NEJM.org, was approved by the review board or ethics committee at each participating site. The study was monitored by an independent data and safety monitoring committee (see the Supplementary Appendix, available at NEJM.org). The collection, management, and analysis of the data were performed by the sponsor according to a pre-specified statistical analysis plan (available with the protocol) that was reviewed by two independent academic statisticians. All drafts of themanuscript were written by the first author and the last two (senior) authors, as well as the three authors affiliated with the sponsor, and were reviewed and edited by all the authors. The steering committee members, all of whom are authors of this article, and the three authors affiliated with Actelion Pharmaceuticals were involved in the decision to submit the manuscript for publication. All the authors had access to the data and vouch for the accuracy and completeness of the analyses and for the fidelity of this report to the study protocol."""

text8 = """Selection of Patients The study population included patients 18 to 75 years of age who had idiopathic or heritable pulmonary arterial hypertension or pulmonary arterial hypertension associated with human immunodeficiency virus infection, drug use or toxin exposure, connective tissue disease, or repaired congenital systemic-to-pulmonary shunts. Confirmation of the diagnosis by means of right heart catheterization was required before screening. Patients were required to have a pulmonary vascular resistance of at least 5 Wood units (400 dyn sec cm−5) and a 6-minute walk distance of 50 to 450 m. Patients who were not receiving treatment for pulmonary arterial hypertension and those who were receiving an endothelin-receptor antagonist, a phosphodiesterase type 5 inhibitor, or both at a dose that had been stable for at least 3 months were eligible for enrollment; patients who were receiving prostacyclin analogues were not eligible. Written informed consent was obtained from all the patients."""

text9 = """Trial Procedures Within 28 days after screening, patients were randomly assigned, in a 1:1 ratio (with stratification according to study center), to receive placebo or selexipag. During the 12-week dose-adjustment phase, selexipag was initiated at a dose of 200 mg twice daily and was increased weekly in twice-daily increments of 200 mg until unmanageable adverse effects associated with prostacyclin use, such as headache or jaw pain, developed (Fig. S1 in the Supplementary Appendix). The dose was then decreased by 200 mg in both daily doses, and this reduced dose was considered to be the maximum tolerated dose for that patient. The maximum dose allowed was 1600 mg twice daily. After 12 weeks, patients entered the maintenance phase of the study. Starting at week 26, doses could be increased at scheduled visits; dose reductions were allowed at any time. The individualized maintenance dose was defined as the dose that a patient received for the longest duration."""

text10 = """Selexipag and placebo were administered in a double-blind fashion. The end of the treatment period was defined for each patient as 7 days after the last intake of selexipag or placebo (Fig. S2 in the Supplementary Appendix). As outlined in Figure 1, the end of the treatment period occurred at the end of the study (for patients who did not have a primary end-point event), after the occurrence of a primary endpoint event, or prematurely for various reasons, such as an adverse event. The end of the study was declared when the prespecified number of primary end-point events in the study population was reached (see the Statistical Analysis section below)."""

text11 = "Clinical assessments that included the 6-minute walk distance and determination of the World Health Organization (WHO) functional class were performed and laboratory data were collected at screening, at baseline, at weeks 8, 16, and 26, and every 6 months thereafter and when worsening of the disease was suspected. Adverse events and serious adverse events were recorded throughout the treatment period and up to 7 days (for adverse events) and 30 days (for serious adverse events) after the last intake of selexipag or placebo. Vital status was recorded at the end of the study."

text12 = """Patients who discontinued selexipag or placebo during the double-blind phase of the study and provided written informed consent for further follow-up were followed during a blinded post-treatment observation period up to the end of the study (see Section 7 in the Supplementary Appendix). Patients who had a nonfatal primary end-point event discontinued the double-blind regimen and were eligible to receive open-label selexipag or commercially available drugs; patients who continued to re- ceive selexipag or placebo throughout the double-blind phase were also eligible to receive open-label selexipag or commercially available drugs at the end of the study. The commercially available drugs represented the local standard of care and were not paid for by the sponsor."""

text13 = """OutcomeMeasures The primary end point in a time-to-event analysis was a composite of death or a complication related to pulmonary arterial hypertension, whichever occurred first, up to the end of the treatment period. Complications related to pulmonary arterial hypertension were disease progression or worsening of pulmonary arterial hypertension that resulted in hospitalization, initiation of parenteral prostanoid therapy or long-term oxygen therapy, or the need for lung transplantation or balloon atrial septostomy as judged by the physician. (Placement on a transplant waiting list represented an acute measure, as confirmed by the critical-event committee, and an actual lung transplantation would also meet this criterion.) Disease progression was defined as a decrease from baseline of at least 15% in the 6-minute walk distance (confirmed by means of a second test on a different day) accompanied by a worsening in WHO functional class (for the patients with WHO functional class II or III at baseline) or the need for additional treatment of pulmonary arterial hypertension (for the patients with WHO functional class III or IV at baseline). An independent critical-event committee whose members were unaware of the study-group assignments adjudicated all events up to the end of the study, including each death, to determine whether it was due to pulmonary arterial hypertension."""

text14 = """Secondary end points, listed in the order of the testing hierarchy, included the change in the 6-minute walk distance from baseline to week 26 (measured at trough levels of the study drug), the absence of worsening of WHO functional class from baseline to week 26, and death due to pulmonary arterial hypertension or hospitalization for worsening of pulmonary arterial hypertension up to the end of treatment period and death from any cause up to the end of the study (both analyzed in a time-to-event analysis). The change in N-terminal pro–brain natriuretic peptide (NT-proBNP) level from baseline to week 26 was analyzed as an exploratory end point. Safety end points included adverse events and abnormal results from laboratory studies."""

text15 = """Statistical Analysis We initially estimated that 202 primary end-point events would be needed for the study to have 90% power to detect a hazard ratio for the primary end point with selexipag, as compared with placebo, of 0.57 over an estimated study duration of 3.5 years, assuming a hazard rate of 0.22 per year in the placebo group, at a one-sided type 1 error rate of 0.005. We calculated that to reach that number of primary end-point events, we would need to enroll 670 patients over the course of 2 years, assuming an annual rate of attrition of 5%. Twenty months after the study was initiated, a blinded review of baseline data from 154 patients indicated that more patients than expected were receiving background therapy for their disease. Therefore, the hypothesized hazard ratio was changed from 0.57 to 0.65 to reflect a lower anticipated treatment effect. To preserve the type 1 and type 2 error rates and the study duration, the required number of primary end-point events was increased to 331 and the required number of patients was increased to 1150. An independent data and safety monitoring committee performed an interim analysis, which had been planned after 202 events had occurred, with stopping rules for futility and efficacy that were based on Haybittle–Peto boundaries. The final analysis used a one-sided significance level of 0.00499."""


In [3]:
text16 = """The primary end-point analysis was an ontreatment analysis with follow-up data censored at the time selexipag or placebo was discontinued. Secondary end points were tested hierarchically to control for multiplicity. In time-to-event analyses, end points were estimated with the use of the Kaplan–Meier method and were analyzed with the use of the log-rank test. Hazard ratios with 99% confidence intervals (for primary and secondary end points) and 95% confidence intervals (for exploratory end points) were estimated with the use of proportional-hazard models. Sensitivity analyses were performed to account for premature discontinuations of placebo or selexipag, and an analysis of the primary end point was performed that excluded the 45 events that occurred before the sample size was increased (see Section 8 in the Supplementary Appendix). We also performed subgroup analyses that included interaction tests.13 In addition, the primary end point was analyzed according to prespecified dose strata: low doses (200 or 400 mg twice daily), medium doses (600, 800, or 1000 mg twice daily), and high doses (1200, 1400, or 1600 mg twice daily)."""

text17 = """At week 26, the changes from baseline in the 6-minute walk distance and in the NT-proBNP level were analyzed with use of a nonparametric analysis of covariance that was adjusted for the baseline value; the proportion of patients who did not have a worsening in WHO functional class was assessed with the use of a nonparametric analysis of covariance that was adjusted for the baseline value and a Cochran–Mantel–Haenszel test stratified according to the baseline value. Missing data for the 6-minute walk ditance and WHO functional class were imputed according to a worst-case scenario (see Section 9 in the Supplementary Appendix). The analysis of NT-proBNP levels was performed with the use of observed data."""

text18 = """Patients A total of 1156 patients were enrolled at 181 centers in 39 countries from December 2009 through May 2013 and were randomly assigned to receive placebo (582 patients) or selexipag (574 patients) (Fig. 1). The patients in the placebo group received placebo for a median duration of 63.7 weeks, and the patients in the selexipag group received selexipag for a median duration of 70.7 weeks. The baseline characteristics of the patients are shown in Table 1. Of the 351 patients who discontinued placebo or selexipag after a nonfatal primary end-point event, 170 provided consent for follow-up during the post-treatment observation period (111 in the placebo group and 59 in the selexipag group); of the 218 patients who discontinued placebo or selexipag prematurely without having a primary end-point event, 80 provided consent for follow-up during the post-treatment observation period (26 in the placebo group and 54 in the selexipag group) (see Section 8 in the Supplementary Appendix). Vital status was reported for 1101 patients (95.2%) at the end of the study. """

text19 = """Primary End Point Overall, 397 patients had a primary end-point event (242 patients [41.6%] in the placebo group and 155 patients [27.0%] in the selexipag group). The hazard ratio for a primary end-point event in the selexipag group was 0.60 (99% confidence interval [CI], 0.46 to 0.78; P<0.001) (Fig. 2). Disease progression and hospitalization accounted for 81.9% of the events (Table 2). The results of sensitivity analyses that were performed to account for premature discontinuations and of an analysis that excluded events that occurred before the sample size was increased were consistent with the results of the primary analysis (Table S1 and Fig. S3 in the Supplementary Appendix). A total of 133 patients (23.2%) received a maintenance dose of selexipag in the low-dose stratum, 179 (31.2%) received a maintenance dose in the medium-dose stratum, and 246 (42.9%) received a maintenance dose in the high-dose stratum (Table S2 in the Supplemen- tary Appendix). The effect of selexipag with respect to the primary end point was consistent across these strata (Fig. S4 in the Supplementary Appendix). The treatment effect with respect to the primary end point was also consistent in the prespecified patient subgroups, with nonsignificant P values for interaction, including in the subgroup of patients who were already receiving two therapies for pulmonary arterial hypertension at baseline (Fig. S5 in the Supplementary Appendix)."""

text20 = """Secondary and Exploratory End Points Missing values were imputed for 21.6% of the patients in the analysis of 6-minute walk distance and for 18.3% of the patients in the analysis of WHO functional class. At week 26, the 6-minute walk distance had decreased by a median of 9.0 m from baseline in the placebo group and had increased by 4.0 m from baseline in the selexipag group (treatment effect, 12.0 m; 99% CI, 1 to 24; P=0.003). At week 26, there was no significant difference between the placebo group and the selexipag group in the proportion of patients with no worsening in WHO functional class (74.9% and 77.8%, respectively; odds ratio, 1.16; 99% CI, 0.81 to 1.66; P=0.28) (Table S3 in the Supplementary Appendix)."""

text21 = """On the basis of the testing hierarchy, the following results should be interpreted as exploratory. By the end of the treatment period, death due to pulmonary arterial hypertension or hospitalization for worsening of pulmonary arterial hypertension had occurred in 137 patients (23.5%) in the placebo group and in 102 patients (17.8%) in the selexipag group (hazard ratio in the selexipag group, 0.70; 95% CI, 0.54 to 0.91; P=0.003); 87.4% of these events were hospitalizations (Table 2). By the end of the study, death from any cause had occurred in 105 patients (18.0%) in the placebo group and in 100 patients (17.4%) in the selexipag group (hazard ratio in the selexipag group, 0.97; 95% CI, 0.74 to 1.28; P=0.42). Findings from a sensitivity analysis that assumed that patients with unknown vital status had died (4.8% of patients) were consistent with the findings of the main analysis of death from any cause (Table S4 in the Supple- mentary Appendix). At week 26, NT-proBNP levels were significantly lower in the selexipag group than in the placebo group (Table S5 in the Supplementary Appendix)."""

text22 = """Safety and Adverse Events Overall, 41 patients (7.1%) in the placebo group and 82 patients (14.3%) in the selexipag group discontinued their study regimen prematurely because of an adverse event (Table 3). The most frequent adverse events leading to discontinuation in the selexipag group (events for which there was >1% difference between the selexipag and placebo groups) were headache (in 3.3% of the patients), diarrhea (in 2.3%), and nausea (in 1.7%). Hyperthyroidism occurred in 8 patients in the selexipag group and led to treatment discontinuation in 1 patient. No serious adverse events were reported more frequently (i.e., at a rate >1% higher) in the selexipag group than in the placebo group. Table 3 lists the most frequent adverse events reported overall. The most frequent adverse events associated with prostacyclin use that were reported during the doseadjustment and maintenance phases are listed in Table S6 in the Supplementary Appendix. Adverse events associated with prostacyclin occurred more frequently during the dose-adjustment phase, when they were used to define the individualized maximum tolerated dose."""


In [4]:
from predict import get_ner_predictions, get_re_predictions
from utils import display_ehr, get_long_relation_table, display_knowledge_graph, get_relation_table


In [5]:
ner_predictions1 = get_ner_predictions(
    ehr_record=text1,
    model_name="biobert")
ner_predictions1.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 31 40
 Entity text: 'selexipag',
 
 ID: T1
 Entity name: Reason
 Character range: 141 172
 Entity text: 'pulmonary arterial hypertension']

In [6]:
re_predictions1 = get_re_predictions(ner_predictions1)
re_predictions1.relations

[]
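The same NER-then-RE pair of cells is repeated for every chunk below. As a sketch under stated assumptions, the repetition could be folded into a loop. `run_pipeline` is a hypothetical helper; it takes the two prediction functions as arguments and assumes only what the cells in this notebook show: the NER function accepts `ehr_record` and `model_name`, its result has `get_entities()`, and the RE result has a `relations` attribute.

```python
def run_pipeline(chunks, ner_fn, re_fn, model_name="biobert"):
    """Run NER, then RE, on each named chunk and collect both results.

    ner_fn and re_fn are expected to behave like get_ner_predictions and
    get_re_predictions as they are called in this notebook (an assumption).
    """
    results = {}
    for name, text in chunks.items():
        ner = ner_fn(ehr_record=text, model_name=model_name)
        rel = re_fn(ner)
        results[name] = {
            "entities": ner.get_entities(),
            "relations": rel.relations,
        }
    return results


# Hypothetical usage with the names defined in this notebook:
# results = run_pipeline(
#     {"text1": text1, "text2": text2},
#     get_ner_predictions,
#     get_re_predictions,
# )
```

Passing the functions in keeps the helper independent of the `predict` module, so it can be exercised with stand-ins before running the models.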

In [7]:
ner_predictions2 = get_ner_predictions(
    ehr_record=text2,
    model_name="biobert")
ner_predictions2.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Reason
 Character range: 131 162
 Entity text: 'pulmonary arterial hypertension',
 
 ID: T1
 Entity name: Drug
 Character range: 174 181
 Entity text: 'placebo',
 
 ID: T2
 Entity name: Drug
 Character range: 185 194
 Entity text: 'selexipag',
 
 ID: T3
 Entity name: Strength
 Character range: 234 241
 Entity text: '1600 μg',
 
 ID: T4
 Entity name: Frequency
 Character range: 242 254
 Entity text: 'twice daily)',
 
 ID: T5
 Entity name: Drug
 Character range: 472 481
 Entity text: 'inhibitor',
 
 ID: T6
 Entity name: Reason
 Character range: 583 614
 Entity text: 'pulmonary arterial hypertension',
 
 ID: T7
 Entity name: Drug
 Character range: 726 735
 Entity text: 'selexipag',
 
 ID: T8
 Entity name: Drug
 Character range: 739 747
 Entity text: 'placebo)']
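Some predicted spans above carry trailing punctuation ('twice daily)' and 'placebo)'). The printed ranges appear to be half-open, since 'placebo' at 174 181 is seven characters long. Under that assumption, a small post-processing step could tighten each span. `trim_span` is a hypothetical helper and is not part of the `predict` or `utils` modules.

```python
import string


def trim_span(text: str, start: int, end: int) -> tuple[int, int]:
    """Shrink the half-open range [start, end) past edge punctuation/spaces."""
    junk = string.punctuation + string.whitespace
    while start < end and text[start] in junk:
        start += 1
    while end > start and text[end - 1] in junk:
        end -= 1
    return start, end


# Small excerpt in the style of text2; offsets are computed, not copied.
sample = "maximum dose, 1600 μg twice daily). Patients were eligible"
start = sample.find("twice daily)")
end = start + len("twice daily)")
new_start, new_end = trim_span(sample, start, end)
print(sample[new_start:new_end])  # twice daily
```

This is deliberately blunt: it would also strip a legitimate closing parenthesis from an entity such as a bracketed abbreviation, so it suits display and de-duplication better than re-annotation.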

In [8]:
re_predictions2 = get_re_predictions(ner_predictions2)
re_predictions2.relations

Prediction:   0%|          | 0/2 [00:00<?, ?it/s]

[]

In [9]:
ner_predictions3 = get_ner_predictions(
    ehr_record=text3,
    model_name="biobert")
ner_predictions3.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 113 122
 Entity text: 'selexipag',
 
 ID: T1
 Entity name: Drug
 Character range: 150 159
 Entity text: 'selexipag',
 
 ID: T2
 Entity name: Drug
 Character range: 348 357
 Entity text: 'selexipag',
 
 ID: T3
 Entity name: Drug
 Character range: 730 739
 Entity text: 'selexipag',
 
 ID: T4
 Entity name: ADE
 Character range: 750 754
 Entity text: 'died']

In [10]:
re_predictions3 = get_re_predictions(ner_predictions3)
re_predictions3.relations

[]

In [11]:
ner_predictions3a = get_ner_predictions(
    ehr_record=text3a,
    model_name="biobert")
ner_predictions3a.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 76 85
 Entity text: 'selexipag',
 
 ID: T1
 Entity name: Drug
 Character range: 206 215
 Entity text: 'selexipag',
 
 ID: T2
 Entity name: Drug
 Character range: 272 284
 Entity text: 'prostacyclin',
 
 ID: T3
 Entity name: ADE
 Character range: 296 304
 Entity text: 'headache',
 
 ID: T4
 Entity name: ADE
 Character range: 306 314
 Entity text: 'diarrhea',
 
 ID: T5
 Entity name: ADE
 Character range: 316 322
 Entity text: 'nausea',
 
 ID: T6
 Entity name: ADE
 Character range: 328 336
 Entity text: 'jaw pain']

In [12]:
re_predictions3a = get_re_predictions(ner_predictions3a)
re_predictions3a.relations

[]

In [13]:
ner_predictions4 = get_ner_predictions(
    ehr_record=text4,
    model_name="biobert")
ner_predictions4.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: ADE
 Character range: 147 178
 Entity text: 'pulmonary arterial hypertension',
 
 ID: T1
 Entity name: Drug
 Character range: 208 217
 Entity text: 'selexipag']

In [14]:
re_predictions4 = get_re_predictions(ner_predictions4)
re_predictions4.relations

[]

In [15]:
ner_predictions5 = get_ner_predictions(
    ehr_record=text5,
    model_name="biobert")
ner_predictions5.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: ADE
 Character range: 0 31
 Entity text: 'Pulmonary arterial hypertension',
 
 ID: T1
 Entity name: Drug
 Character range: 227 239
 Entity text: 'prostacyclin',
 
 ID: T2
 Entity name: Route
 Character range: 274 285
 Entity text: 'intravenous',
 
 ID: T3
 Entity name: Drug
 Character range: 286 298
 Entity text: 'prostacyclin',
 
 ID: T4
 Entity name: Reason
 Character range: 327 336
 Entity text: 'pulmonary',
 
 ID: T5
 Entity name: Reason
 Character range: 346 358
 Entity text: 'hypertension',
 
 ID: T6
 Entity name: Drug
 Character range: 456 468
 Entity text: 'prostacyclin']

In [16]:
re_predictions5 = get_re_predictions(ner_predictions5)
re_predictions5.relations

[]

In [17]:
ner_predictions6 = get_ner_predictions(
    ehr_record=text6,
    model_name="biobert")
ner_predictions6.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 0 9
 Entity text: 'Selexipag',
 
 ID: T1
 Entity name: Drug
 Character range: 99 112
 Entity text: 'prostacyclin.',
 
 ID: T2
 Entity name: Drug
 Character range: 245 254
 Entity text: 'selexipag',
 
 ID: T3
 Entity name: ADE
 Character range: 255 282
 Entity text: 'increased the cardiac index',
 
 ID: T4
 Entity name: ADE
 Character range: 371 379
 Entity text: 'increase',
 
 ID: T5
 Entity name: ADE
 Character range: 462 499
 Entity text: 'reduced pulmonary vascular resistance',
 
 ID: T6
 Entity name: Drug
 Character range: 699 708
 Entity text: 'selexipag']

In [18]:
re_predictions6 = get_re_predictions(ner_predictions6)
re_predictions6.relations

Prediction:   0%|          | 0/2 [00:00<?, ?it/s]

[
 ID: R1
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T0
 Entity name: Drug
 Character range: 0 9
 Entity text: 'Selexipag'
 
 Entity 2: 
 
 ID: T3
 Entity name: ADE
 Character range: 255 282
 Entity text: 'increased the cardiac index',
 
 ID: R2
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T0
 Entity name: Drug
 Character range: 0 9
 Entity text: 'Selexipag'
 
 Entity 2: 
 
 ID: T4
 Entity name: ADE
 Character range: 371 379
 Entity text: 'increase',
 
 ID: R3
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T0
 Entity name: Drug
 Character range: 0 9
 Entity text: 'Selexipag'
 
 Entity 2: 
 
 ID: T5
 Entity name: ADE
 Character range: 462 499
 Entity text: 'reduced pulmonary vascular resistance',
 
 ID: R4
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T1
 Entity name: Drug
 Character range: 99 112
 Entity text: 'prostacyclin.'
 
 Entity 2: 
 
 ID: T3
 Entity name: ADE
 Character range: 255 282
 Entity text: 'increased the cardiac index',
 
 ID: R5
 Relation type: ADE-Drug
 
 Entity 1

In [19]:
ner_predictions7 = get_ner_predictions(
    ehr_record=text7,
    model_name="biobert")
ner_predictions7.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[]

In [20]:
re_predictions7 = get_re_predictions(ner_predictions7)
re_predictions7.relations

[]

### Oleg's Highlight:
18 to 75 years of age ... idiopathic or heritable pulmonary arterial hypertension or pulmonary arterial hypertension associated with human immunodeficiency virus infection, drug use or toxin exposure, connective tissue disease, or repaired congenital systemic-to-pulmonary shunts.

In [21]:
ner_predictions8 = get_ner_predictions(
    ehr_record=text8,
    model_name="biobert")
ner_predictions8.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 719 740
 Entity text: '-receptor antagonist,',
 
 ID: T1
 Entity name: Drug
 Character range: 743 778
 Entity text: 'phosphodiesterase type 5 inhibitor,',
 
 ID: T2
 Entity name: Drug
 Character range: 898 910
 Entity text: 'prostacyclin']

In [22]:
re_predictions8 = get_re_predictions(ner_predictions8)
re_predictions8.relations

[]

In [23]:
ner_predictions9 = get_ner_predictions(
    ehr_record=text9,
    model_name="biobert")
ner_predictions9.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 157 164
 Entity text: 'placebo',
 
 ID: T1
 Entity name: Drug
 Character range: 168 178
 Entity text: 'selexipag.',
 
 ID: T2
 Entity name: Duration
 Character range: 190 192
 Entity text: '12',
 
 ID: T3
 Entity name: Drug
 Character range: 221 230
 Entity text: 'selexipag',
 
 ID: T4
 Entity name: Strength
 Character range: 258 264
 Entity text: '200 mg',
 
 ID: T5
 Entity name: Frequency
 Character range: 265 276
 Entity text: 'twice daily',
 
 ID: T6
 Entity name: Strength
 Character range: 331 337
 Entity text: '200 mg',
 
 ID: T7
 Entity name: Drug
 Character range: 389 401
 Entity text: 'prostacyclin',
 
 ID: T8
 Entity name: ADE
 Character range: 415 423
 Entity text: 'headache',
 
 ID: T9
 Entity name: ADE
 Character range: 427 435
 Entity text: 'jaw pain',
 
 ID: T10
 Entity name: Strength
 Character range: 519 525
 Entity text: '200 mg',
 
 ID: T11
 Entity name: Strength
 Character range: 664 671
 Entity text: '1600 mg',
 
 ID: 

In [24]:
re_predictions9 = get_re_predictions(ner_predictions9)
re_predictions9.relations

Prediction:   0%|          | 0/4 [00:00<?, ?it/s]

[
 ID: R1
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T1
 Entity name: Drug
 Character range: 168 178
 Entity text: 'selexipag.'
 
 Entity 2: 
 
 ID: T8
 Entity name: ADE
 Character range: 415 423
 Entity text: 'headache',
 
 ID: R2
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T3
 Entity name: Drug
 Character range: 221 230
 Entity text: 'selexipag'
 
 Entity 2: 
 
 ID: T9
 Entity name: ADE
 Character range: 427 435
 Entity text: 'jaw pain',
 
 ID: R3
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T7
 Entity name: Drug
 Character range: 389 401
 Entity text: 'prostacyclin'
 
 Entity 2: 
 
 ID: T8
 Entity name: ADE
 Character range: 415 423
 Entity text: 'headache',
 
 ID: R4
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T7
 Entity name: Drug
 Character range: 389 401
 Entity text: 'prostacyclin'
 
 Entity 2: 
 
 ID: T9
 Entity name: ADE
 Character range: 427 435
 Entity text: 'jaw pain']
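The four relations above link two surface forms of the same drug ('selexipag.' and 'selexipag') to adverse events. A minimal sketch of collapsing such output into one entry per drug follows. The pairs are transcribed by hand from the output above, and `group_ades_by_drug` is a hypothetical helper that works on plain tuples rather than on the relation objects themselves, whose attributes are not shown in this notebook.

```python
from collections import defaultdict

# (drug text, ADE text) pairs transcribed from the ADE-Drug relations above.
pairs = [
    ("selexipag.", "headache"),
    ("selexipag", "jaw pain"),
    ("prostacyclin", "headache"),
    ("prostacyclin", "jaw pain"),
]


def group_ades_by_drug(pairs):
    """Map each normalized drug name to the sorted list of ADEs linked to it."""
    grouped = defaultdict(set)
    for drug, ade in pairs:
        # Normalize case and strip edge punctuation so 'selexipag.' and
        # 'selexipag' fall under the same key.
        key = drug.strip(" .,;:()").lower()
        grouped[key].add(ade.strip(" .,;:()").lower())
    return {drug: sorted(ades) for drug, ades in grouped.items()}


print(group_ades_by_drug(pairs))
# {'selexipag': ['headache', 'jaw pain'], 'prostacyclin': ['headache', 'jaw pain']}
```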

In [25]:
ner_predictions10 = get_ner_predictions(
    ehr_record=text10,
    model_name="biobert")
ner_predictions10.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 0 9
 Entity text: 'Selexipag',
 
 ID: T1
 Entity name: Drug
 Character range: 14 21
 Entity text: 'placebo',
 
 ID: T2
 Entity name: Drug
 Character range: 163 172
 Entity text: 'selexipag',
 
 ID: T3
 Entity name: Drug
 Character range: 176 183
 Entity text: 'placebo']

In [26]:
re_predictions10 = get_re_predictions(ner_predictions10)
re_predictions10.relations

[]

In [27]:
ner_predictions11 = get_ner_predictions(
    ehr_record=text11,
    model_name="biobert")
ner_predictions11.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 508 517
 Entity text: 'selexipag']

In [28]:
re_predictions11 = get_re_predictions(ner_predictions11)
re_predictions11.relations

[]

In [29]:
ner_predictions12 = get_ner_predictions(
    ehr_record=text12,
    model_name="biobert")
ner_predictions12.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 26 35
 Entity text: 'selexipag',
 
 ID: T1
 Entity name: Drug
 Character range: 39 46
 Entity text: 'placebo',
 
 ID: T2
 Entity name: Drug
 Character range: 419 428
 Entity text: 'selexipag',
 
 ID: T3
 Entity name: Drug
 Character range: 498 507
 Entity text: 'selexipag',
 
 ID: T4
 Entity name: Drug
 Character range: 511 518
 Entity text: 'placebo',
 
 ID: T5
 Entity name: Drug
 Character range: 594 603
 Entity text: 'selexipag']

In [30]:
re_predictions12 = get_re_predictions(ner_predictions12)
re_predictions12.relations

[]
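
The cells above repeat the same two calls for every chunk. A minimal sketch of a loop that does the same thing, assuming `get_ner_predictions` and `get_re_predictions` behave as they do in this notebook (keyword arguments `ehr_record` and `model_name`, a `get_entities()` method and a `relations` attribute). The helper name `run_chunks` is hypothetical, and the two functions are passed in as arguments so the sketch does not assume any import path.

```python
from typing import Any, Callable, Dict, Tuple


def run_chunks(
    chunks: Dict[str, str],
    ner_fn: Callable[..., Any],
    re_fn: Callable[[Any], Any],
    model_name: str = "biobert",
) -> Dict[str, Tuple[list, list]]:
    """Run NER, then RE, on each chunk; return {name: (entities, relations)}."""
    results = {}
    for name, text in chunks.items():
        # Same call pattern as the individual cells in this notebook.
        ner = ner_fn(ehr_record=text, model_name=model_name)
        rel = re_fn(ner)
        results[name] = (ner.get_entities(), rel.relations)
    return results


# Hypothetical usage inside the notebook:
# results = run_chunks(
#     {"text10": text10, "text11": text11, "text12": text12},
#     get_ner_predictions,
#     get_re_predictions,
# )
```

Keeping the results in one dictionary keyed by chunk name also makes it easier to compare chunks that returned no relations with those that did.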

### Oleg's highlight
primary end point in a time-to-event analysis was a composite of death or a complication related to pulmonary arterial hypertension,

In [31]:
ner_predictions13 = get_ner_predictions(
    ehr_record=text13,
    model_name="biobert")
ner_predictions13.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Reason
 Character range: 335 347
 Entity text: 'hypertension',
 
 ID: T1
 Entity name: Drug
 Character range: 407 417
 Entity text: 'prostanoid',
 
 ID: T2
 Entity name: Drug
 Character range: 439 445
 Entity text: 'oxygen']

In [32]:
re_predictions13 = get_re_predictions(ner_predictions13)
re_predictions13.relations

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[]

In [33]:
ner_predictions14 = get_ner_predictions(
    ehr_record=text14,
    model_name="biobert")
ner_predictions14.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 560 567
 Entity text: 'peptide']

In [34]:
re_predictions14 = get_re_predictions(ner_predictions14)
re_predictions14.relations

[]

In [35]:
ner_predictions15 = get_ner_predictions(
    ehr_record=text15,
    model_name="biobert")
ner_predictions15.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: ADE
 Character range: 159 176
 Entity text: 'primary end point',
 
 ID: T1
 Entity name: Drug
 Character range: 182 192
 Entity text: 'selexipag,']

In [36]:
re_predictions15 = get_re_predictions(ner_predictions15)
re_predictions15.relations

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: R1
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T1
 Entity name: Drug
 Character range: 182 192
 Entity text: 'selexipag,'
 
 Entity 2: 
 
 ID: T0
 Entity name: ADE
 Character range: 159 176
 Entity text: 'primary end point']
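
Several predicted spans in this notebook carry a trailing punctuation mark (for example `'selexipag,'` at 182 to 192, or `'70.7 weeks.'`). A minimal post-processing sketch, assuming each entity has first been copied into a plain `(start, end, text)` tuple with an exclusive end offset, as the printed character ranges suggest. The helper name `trim_trailing_punct` and the tuple layout are hypothetical and not part of the prediction API.

```python
from typing import Tuple

Span = Tuple[int, int, str]  # (start, end, text); end is exclusive

# Only strip marks that are clearly sentence punctuation, so values
# such as "27.0%" or a closing parenthesis are left untouched.
TRAILING = ",.;:"


def trim_trailing_punct(span: Span) -> Span:
    """Drop trailing punctuation from a span and shrink its end offset to match."""
    start, end, text = span
    stripped = text.rstrip(TRAILING)
    if not stripped:
        # The span was punctuation only; leave it unchanged.
        return span
    removed = len(text) - len(stripped)
    return (start, end - removed, stripped)


print(trim_trailing_punct((182, 192, "selexipag,")))  # (182, 191, 'selexipag')
```

Trimming before relation extraction or before comparing entity texts avoids treating `'selexipag,'` and `'selexipag'` as different drugs.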

In [37]:
ner_predictions16 = get_ner_predictions(
    ehr_record=text16,
    model_name="biobert")
ner_predictions16.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 100 109
 Entity text: 'selexipag',
 
 ID: T1
 Entity name: Drug
 Character range: 113 120
 Entity text: 'placebo',
 
 ID: T2
 Entity name: Drug
 Character range: 643 650
 Entity text: 'placebo',
 
 ID: T3
 Entity name: Drug
 Character range: 654 664
 Entity text: 'selexipag,',
 
 ID: T6
 Entity name: Frequency
 Character range: 1078 1083
 Entity text: 'twice',
 
 ID: T9
 Entity name: Frequency
 Character range: 1131 1136
 Entity text: 'twice']

In [38]:
re_predictions16 = get_re_predictions(ner_predictions16)
re_predictions16.relations

[]

In [39]:
ner_predictions17 = get_ner_predictions(
    ehr_record=text17,
    model_name="biobert")
ner_predictions17.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 79 88
 Entity text: 'NT-proBNP',
 
 ID: T1
 Entity name: Drug
 Character range: 649 658
 Entity text: 'NT-proBNP']

In [40]:
re_predictions17 = get_re_predictions(ner_predictions17)
re_predictions17.relations

[]

In [41]:
ner_predictions18 = get_ner_predictions(
    ehr_record=text18,
    model_name="biobert")
ner_predictions18.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 153 160
 Entity text: 'placebo',
 
 ID: T1
 Entity name: Drug
 Character range: 179 188
 Entity text: 'selexipag',
 
 ID: T2
 Entity name: Drug
 Character range: 257 264
 Entity text: 'placebo',
 
 ID: T3
 Entity name: Duration
 Character range: 290 301
 Entity text: '63.7 weeks,',
 
 ID: T4
 Entity name: Drug
 Character range: 326 335
 Entity text: 'selexipag',
 
 ID: T5
 Entity name: Drug
 Character range: 351 360
 Entity text: 'selexipag',
 
 ID: T6
 Entity name: Duration
 Character range: 386 397
 Entity text: '70.7 weeks.',
 
 ID: T7
 Entity name: Drug
 Character range: 502 509
 Entity text: 'placebo',
 
 ID: T8
 Entity name: Drug
 Character range: 513 522
 Entity text: 'selexipag',
 
 ID: T9
 Entity name: Drug
 Character range: 685 694
 Entity text: 'selexipag',
 
 ID: T10
 Entity name: Drug
 Character range: 740 747
 Entity text: 'placebo',
 
 ID: T11
 Entity name: Drug
 Character range: 751 760
 Entity text: 'selexipag',
 
 ID: T12

In [42]:
re_predictions18 = get_re_predictions(ner_predictions18)
re_predictions18.relations

Prediction:   0%|          | 0/2 [00:00<?, ?it/s]

[]

### Oleg's highlight
primary end-point [27.0%] in the selexipag group).

In [43]:
ner_predictions19 = get_ner_predictions(
    ehr_record=text19,
    model_name="biobert")
ner_predictions19.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 145 154
 Entity text: 'selexipag',
 
 ID: T1
 Entity name: Drug
 Character range: 217 226
 Entity text: 'selexipag',
 
 ID: T2
 Entity name: Drug
 Character range: 754 763
 Entity text: 'selexipag',
 
 ID: T3
 Entity name: Drug
 Character range: 984 993
 Entity text: 'selexipag',
 
 ID: T4
 Entity name: Reason
 Character range: 1355 1386
 Entity text: 'pulmonary arterial hypertension']

In [44]:
re_predictions19 = get_re_predictions(ner_predictions19)
re_predictions19.relations

[]

### Oleg's highlight
At week 26, ... 6-minute walk distance ... median ... increased by 4.0 m from baseline in the selexipag group

In [45]:
ner_predictions20 = get_ner_predictions(
    ehr_record=text20,
    model_name="biobert")
ner_predictions20.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: ADE
 Character range: 220 256
 Entity text: '6-minute walk distance had decreased',
 
 ID: T1
 Entity name: Drug
 Character range: 361 370
 Entity text: 'selexipag',
 
 ID: T2
 Entity name: Drug
 Character range: 513 522
 Entity text: 'selexipag']

In [46]:
re_predictions20 = get_re_predictions(ner_predictions20)
re_predictions20.relations

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: R1
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T1
 Entity name: Drug
 Character range: 361 370
 Entity text: 'selexipag'
 
 Entity 2: 
 
 ID: T0
 Entity name: ADE
 Character range: 220 256
 Entity text: '6-minute walk distance had decreased']

In [47]:
ner_predictions21 = get_ner_predictions(
    ehr_record=text21,
    model_name="biobert")
ner_predictions21.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: ADE
 Character range: 148 166
 Entity text: 'pulmonary arterial',
 
 ID: T1
 Entity name: Drug
 Character range: 341 350
 Entity text: 'selexipag',
 
 ID: T2
 Entity name: Drug
 Character range: 378 387
 Entity text: 'selexipag',
 
 ID: T3
 Entity name: ADE
 Character range: 513 518
 Entity text: 'death',
 
 ID: T4
 Entity name: Drug
 Character range: 627 636
 Entity text: 'selexipag',
 
 ID: T5
 Entity name: Drug
 Character range: 664 673
 Entity text: 'selexipag',
 
 ID: T6
 Entity name: Drug
 Character range: 971 980
 Entity text: 'NT-proBNP',
 
 ID: T7
 Entity name: Drug
 Character range: 1020 1029
 Entity text: 'selexipag']

In [48]:
re_predictions21 = get_re_predictions(ner_predictions21)
re_predictions21.relations

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: R1
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T1
 Entity name: Drug
 Character range: 341 350
 Entity text: 'selexipag'
 
 Entity 2: 
 
 ID: T0
 Entity name: ADE
 Character range: 148 166
 Entity text: 'pulmonary arterial',
 
 ID: R2
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T2
 Entity name: Drug
 Character range: 378 387
 Entity text: 'selexipag'
 
 Entity 2: 
 
 ID: T0
 Entity name: ADE
 Character range: 148 166
 Entity text: 'pulmonary arterial',
 
 ID: R3
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T4
 Entity name: Drug
 Character range: 627 636
 Entity text: 'selexipag'
 
 Entity 2: 
 
 ID: T3
 Entity name: ADE
 Character range: 513 518
 Entity text: 'death',
 
 ID: R4
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T5
 Entity name: Drug
 Character range: 664 673
 Entity text: 'selexipag'
 
 Entity 2: 
 
 ID: T3
 Entity name: ADE
 Character range: 513 518
 Entity text: 'death']

In [49]:
ner_predictions22 = get_ner_predictions(
    ehr_record=text22,
    model_name="biobert")
ner_predictions22.get_entities()

Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[
 ID: T0
 Entity name: Drug
 Character range: 106 115
 Entity text: 'selexipag',
 
 ID: T1
 Entity name: Drug
 Character range: 273 282
 Entity text: 'selexipag',
 
 ID: T2
 Entity name: Drug
 Character range: 344 353
 Entity text: 'selexipag',
 
 ID: T3
 Entity name: ADE
 Character range: 379 387
 Entity text: 'headache',
 
 ID: T4
 Entity name: ADE
 Character range: 415 423
 Entity text: 'diarrhea',
 
 ID: T5
 Entity name: ADE
 Character range: 439 445
 Entity text: 'nausea',
 
 ID: T6
 Entity name: ADE
 Character range: 457 472
 Entity text: 'Hyperthyroidism',
 
 ID: T7
 Entity name: Drug
 Character range: 503 512
 Entity text: 'selexipag',
 
 ID: T8
 Entity name: Drug
 Character range: 662 671
 Entity text: 'selexipag',
 
 ID: T9
 Entity name: Drug
 Character range: 819 831
 Entity text: 'prostacyclin',
 
 ID: T10
 Entity name: Drug
 Character range: 989 1001
 Entity text: 'prostacyclin']

In [50]:
re_predictions22 = get_re_predictions(ner_predictions22)
re_predictions22.relations

Prediction:   0%|          | 0/2 [00:00<?, ?it/s]

[
 ID: R1
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T0
 Entity name: Drug
 Character range: 106 115
 Entity text: 'selexipag'
 
 Entity 2: 
 
 ID: T6
 Entity name: ADE
 Character range: 457 472
 Entity text: 'Hyperthyroidism',
 
 ID: R2
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T1
 Entity name: Drug
 Character range: 273 282
 Entity text: 'selexipag'
 
 Entity 2: 
 
 ID: T6
 Entity name: ADE
 Character range: 457 472
 Entity text: 'Hyperthyroidism',
 
 ID: R3
 Relation type: ADE-Drug
 
 Entity 1: 
 
 ID: T2
 Entity name: Drug
 Character range: 344 353
 Entity text: 'selexipag'
 
 Entity 2: 
 
 ID: T6
 Entity name: ADE
 Character range: 457 472
 Entity text: 'Hyperthyroidism']
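
The three relations above differ only in which mention of selexipag they attach to; all of them pair the same drug with the same ADE. A minimal sketch of collapsing such repeats, assuming the relations have first been flattened by hand into `(relation_type, entity1_text, entity2_text)` tuples. The tuple layout and the helper name `unique_pairs` are hypothetical; the attribute names of the notebook's relation objects are not shown here, so the sketch does not rely on them.

```python
from typing import Iterable, List, Tuple

Relation = Tuple[str, str, str]  # (relation type, entity 1 text, entity 2 text)


def unique_pairs(relations: Iterable[Relation]) -> List[Relation]:
    """Keep the first relation for each (type, text, text) key, ignoring case."""
    seen = set()
    unique = []
    for rel_type, e1, e2 in relations:
        key = (rel_type, e1.lower(), e2.lower())
        if key not in seen:
            seen.add(key)
            unique.append((rel_type, e1, e2))
    return unique


rels = [
    ("ADE-Drug", "selexipag", "Hyperthyroidism"),  # R1
    ("ADE-Drug", "selexipag", "Hyperthyroidism"),  # R2
    ("ADE-Drug", "selexipag", "Hyperthyroidism"),  # R3
]
print(unique_pairs(rels))  # [('ADE-Drug', 'selexipag', 'Hyperthyroidism')]
```

This loses the character offsets of the individual mentions, so it suits a summary of drug and ADE pairs rather than span-level evaluation.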