In [0]:
import pandas as pd

from check_correctness import get_create_training_set, LOCAL_FILES, explain_negative_examples

curations_file = LOCAL_FILES.joinpath("indra_assembly_curations.json")
corpus = LOCAL_FILES.joinpath("indra_assembly_curations.json")

# Building the dataframe from the statements takes ~90 minutes. If the output file is already cached, it is loaded from the cache instead.
training_data_df = get_create_training_set(curations_file=curations_file, statement_json_file=corpus)

res_dict = explain_negative_examples(training_data_df=training_data_df)
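The prompts stored in `res_dict["chat_qa"]` all follow the same layout: the evidence text, the statement, an optional synonym list, and a closing question. The following is a minimal sketch of how such a prompt could be assembled. `build_prompt` and its parameters are hypothetical names chosen for illustration, not part of `check_correctness`, and the exact line breaks are an assumption based on the rendered prompts.

```python
def build_prompt(text, statement, synonyms=(), text_kind="sentence"):
    """Assemble a prompt in the layout seen in the `prompt` column.

    Hypothetical helper for illustration only.
    text_kind is "sentence" or "paragraph"; synonyms is an iterable of
    (name_in_text, name_in_statement) pairs.
    """
    parts = [
        f"Here is a {text_kind} and a statement:",
        f'{text_kind}: "{text}"',
        f'statement: "{statement}"',
    ]
    if synonyms:
        # One quoted pair per line, text name first, statement name second.
        pairs = "\n".join(f'"{a}" and "{b}"' for a, b in synonyms)
        parts.append(
            f"Synonyms in the {text_kind} and statement, respectively:\n{pairs}"
        )
    parts.append(
        f"Is the statement implied by the {text_kind}?\n"
        "If it isn't, please explain why.\n"
    )
    # Blocks are separated by a blank line.
    return "\n\n".join(parts)


example = build_prompt(
    text=(
        "Furthermore, apoA-V-enriched HDL was associated with enhanced "
        "LCAT activity and increased cholesterol efflux."
    ),
    statement="HDL binds LCAT.",
)
print(example)
```

Listing the synonym pairs explicitly gives the model the grounding it needs to link a name in the text (for example "p85beta") to the name used in the statement ("PIK3R2").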

In [7]:
from IPython.display import display

qa_df = pd.DataFrame(res_dict["chat_qa"])

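# Preserve the line breaks inside the prompts and give the long text columns a minimum width.
# The columns of qa_df are prompt, response and curation_tag, and the CSS property is 'min-width'.
styled_df = qa_df.style.set_properties(**{'white-space': 'pre-wrap'})
styled_df = styled_df.set_properties(subset=['prompt', 'response'], **{'min-width': '300px'})
display(styled_df)

qa_df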

Unnamed: 0,prompt,response,curation_tag
0,"Here is a sentence and a statement: sentence: ""Furthermore, upon Ang-1 stimulation p85beta co-immunoprecipitated with Tie-2 and was tyrosine phosphorylated, suggesting that p85beta directly interact with Tie-2 (d)."" statement: ""TM7SF2 activates PIK3R2."" Synonyms in the sentence and statement, respectively: ""Ang-1"" and ""TM7SF2"" ""p85beta"" and ""PIK3R2"" Is the statement implied by the sentence? If it isn't, please explain why.","No, the statement is not implied by the sentence. The sentence discusses the interaction between Ang-1 stimulation and p85beta with Tie-2, while the statement discusses the activation of PIK3R2 by TM7SF2. There is no direct connection or mention of TM7SF2 or PIK3R2 in the sentence.",grounding
1,"Here is a sentence and a statement: sentence: ""Here we show that TNRC6A interacts with eIF4E2, a homologue of eIF4E that can bind to the cap but cannot interact with eIF4G to initiate translation, to inhibit the translation of target mRNAs."" statement: ""EIF4G binds EIF4E2."" Is the statement implied by the sentence? If it isn't, please explain why.","No, the statement ""EIF4G binds EIF4E2"" is not implied by the sentence. The sentence only states that TNRC6A interacts with eIF4E2, and that eIF4E2 can bind to the cap but cannot interact with eIF4G to initiate translation. There is no mention of EIF4G binding to EIF4E2 in the sentence.",negative_result
2,"Here is a sentence and a statement: sentence: ""Hypoxia increased HIF-1α and iNOS expression in HaCaT cells and inhibition of HIF-1α expression decreased iNOS expression."" statement: ""NOS2 increases the amount of NOS2."" Synonyms in the sentence and statement, respectively: ""iNOS"" and ""NOS2"" ""iNOS"" and ""NOS2"" Is the statement implied by the sentence? If it isn't, please explain why.","No, the statement is not implied by the sentence. The sentence discusses the relationship between hypoxia, HIF-1α, and iNOS expression in HaCaT cells, while the statement simply states that NOS2 increases the amount of NOS2, without any context or explanation of how it relates to the sentence.",no_relation
3,"Here is a sentence and a statement: sentence: ""Furthermore, apoA-V-enriched HDL was associated with enhanced LCAT activity and increased cholesterol efflux."" statement: ""HDL binds LCAT."" Is the statement implied by the sentence? If it isn't, please explain why.","The statement ""HDL binds LCAT"" is not explicitly stated in the sentence. The sentence only mentions that apoA-V-enriched HDL is associated with enhanced LCAT activity and increased cholesterol efflux. While it is possible that HDL binds LCAT, this is not directly stated in the sentence.",grounding
4,"Here is a paragraph and a statement: paragraph: ""Therefore, it remains unclear how the rates of recombination in regions such as those described in this manuscript for the NOS2 promoter compare to other regions of the genome designated as ""hot spots."" The region of the human NOS2 promoter approximately 4 kb upstream of the NOS2 transcription start site that is near the putative NOS2 promoter recombination hot spot has a high degree of homology with murine and bovine NOS2 sequences."" statement: ""NOS2 increases the amount of NOS2."" Is the statement implied by the paragraph? If it isn't, please explain why.","No, the statement is not implied by the paragraph. The paragraph discusses the recombination rates in the NOS2 promoter region and its homology with other species, but it does not mention anything about NOS2 increasing its own amount.",no_relation
5,"Here is a sentence and a statement: sentence: ""IFN I producing PDC, therefore, may switch on the cytotoxic function of bystander cells at sites of viral infection, in tumors and, upon chronic activation, in autoimmune tissue damage (;)."" statement: ""Interferon activates PDC."" Synonyms in the sentence and statement, respectively: ""IFN"" and ""Interferon"" Is the statement implied by the sentence? If it isn't, please explain why.","Yes, the statement ""Interferon activates PDC"" is implied by the sentence ""IFN I producing PDC, therefore, may switch on the cytotoxic function of bystander cells at sites of viral infection, in tumors and, upon chronic activation, in autoimmune tissue damage (;)."" The sentence suggests that IFN I (Interferon) is responsible for activating PDC, which in turn activates the cytotoxic function of bystander cells.",other
6,"Here is a sentence and a statement: sentence: ""Since ASF1 mediates eviction and deposition of histones during transcription, the interaction of VZV IE63 with ASF1 may help to regulate transcription of viral or cellular genes during lytic and/or latent infection."" statement: ""SRSF1 activates Histone."" Synonyms in the sentence and statement, respectively: ""Asf1"" and ""SRSF1"" ""Histones"" and ""Histone"" Is the statement implied by the sentence? If it isn't, please explain why.","No, the statement is not implied by the sentence. The sentence discusses the role of ASF1 in regulating transcription, while the statement discusses the activation of histones by SRSF1, which is not directly related to the topic of the sentence.",grounding
7,"Here is a sentence and a statement: sentence: ""We found that endogenous FoxQ1 protein bound to both of the FoxQ1 binding sites of the NRXN3 promoter ( xref )."" statement: ""FOXQ1 binds NRXN3."" Is the statement implied by the sentence? If it isn't, please explain why.","Yes, the statement ""FOXQ1 binds NRXN3"" is implied by the sentence ""We found that endogenous FoxQ1 protein bound to both of the FoxQ1 binding sites of the NRXN3 promoter.""",wrong_relation
8,"Here is a paragraph and a statement: paragraph: ""In our previous report, we suggested that EGF stimulates insulin secretion by activating PLD2, but the mechanism of EGF dependent PLD2 activation in this process remains unknown.In this study, we revealed that Cdk5 is a key regulator of EGF dependent insulin secretion in a pancreatic beta-cell line."" statement: ""EGF activates INS."" Synonyms in the paragraph and statement, respectively: ""Insulin"" and ""INS"" Is the statement implied by the paragraph? If it isn't, please explain why.","Yes, the statement ""EGF activates INS"" is implied by the paragraph. The paragraph states that EGF stimulates insulin secretion, which means that EGF activates the production and release of insulin (INS).",act_vs_amt
9,"Here is a sentence and a statement: sentence: ""As shown in the in vitro HAT assay, MSL1v1 Delta800aa acetylates H4 K16 but not p53 K120 (XREF_FIG)."" statement: ""KANSL1 acetylates TP53 on K120."" Synonyms in the sentence and statement, respectively: ""MSL1V1"" and ""KANSL1"" ""P53"" and ""TP53"" Is the statement implied by the sentence? If it isn't, please explain why.","No, the statement is not implied by the sentence. The sentence is discussing the acetylation of H4 K16 and p53 K120 by MSL1v1 Delta800aa, while the statement is discussing the acetylation of TP53 on K120 by KANSL1. The two sentences are discussing different proteins and different acetylation sites, so the statement is not implied by the sentence.",negative_result
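Reading the responses row by row is slow, so a quick tally of the verdicts per curation tag can help. The sketch below assumes that every row is an example curators marked as incorrect (as the name `explain_negative_examples` suggests), so a response that opens with "Yes" disagrees with the curation. `tally_verdicts` is a hypothetical helper, and the demo frame holds only the opening sentence of each response from the table above.

```python
import pandas as pd


def tally_verdicts(df):
    """Count yes / no / unclear verdicts per curation tag.

    Hypothetical helper: the verdict is taken from the first word of the
    response, which is a rough heuristic rather than a real parser.
    """
    first_word = (
        df["response"].str.strip().str.extract(r"^(\w+)", expand=False).str.lower()
    )
    # Anything that does not start with "yes" or "no" is counted as unclear.
    verdict = first_word.where(first_word.isin(["yes", "no"]), "unclear")
    return pd.crosstab(df["curation_tag"], verdict.rename("verdict"))


demo_df = pd.DataFrame(
    {
        "response": [
            "No, the statement is not implied by the sentence.",
            'No, the statement "EIF4G binds EIF4E2" is not implied by the sentence.',
            "No, the statement is not implied by the sentence.",
            'The statement "HDL binds LCAT" is not explicitly stated in the sentence.',
            "No, the statement is not implied by the paragraph.",
            'Yes, the statement "Interferon activates PDC" is implied by the sentence.',
            "No, the statement is not implied by the sentence.",
            'Yes, the statement "FOXQ1 binds NRXN3" is implied by the sentence.',
            'Yes, the statement "EGF activates INS" is implied by the paragraph.',
            "No, the statement is not implied by the sentence.",
        ],
        "curation_tag": [
            "grounding", "negative_result", "no_relation", "grounding",
            "no_relation", "other", "grounding", "wrong_relation",
            "act_vs_amt", "negative_result",
        ],
    }
)

verdict_counts = tally_verdicts(demo_df)
print(verdict_counts)
```

With the real data, the same call would be `tally_verdicts(qa_df)`.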


