## Reasoning Framework
Develop a reasoning framework that can be used to fulfill use cases.  
General structure:  
1. For patients with a given problem, what other problems, procedures, and other events do they have greater odds of experiencing? (see [A&P_drafter.ipynb](A&P_drafter.ipynb))
2. Identify patient goals and aversions regarding the things they are likely to experience.
3. Identify the most likely paths between the patient and the outcomes they desire or desire to avoid.
4. Identify the interventions most likely to put the patient onto their desired paths and block the paths they wish to avoid  
If time permits:  
5. Present plans to the patient and update aversion scores based on their acceptance/rejection. 
6. Repeat steps 2-5 with the updated aversion scores until the most acceptable plan is reached.   
Else:  
5. Act. If possible, get the patient's feedback afterwards to update aversion scores for future patients' benefit.

### Identify patient goals and aversions

From the patient's current condition, identify which problems are currently active and which are likely to develop, based on causal chains in the knowledge graph and weighted associations in patient data. In discussion with the patient, identify which outcomes the patient wishes to avoid. Store the patient's sentiment toward each outcome to provide an initial guess at other patients' priorities. Sentiment scores are calculated by dividing the item's rank by the length of the list, so they fall between 0 and 1: lower scores mean stronger aversion, and the least aversive item always scores 1. For example, a patient might rank, from most to least averse:
1. stroke (sentiment score 1/3 ≈ 0.33)
2. taking a pill every day (sentiment score 2/3 ≈ 0.67)
3. hypertension (sentiment score 3/3 = 1)
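
The rank-divided-by-length calculation can be sketched in a few lines of Python. The function name `sentiment_scores` is hypothetical and not part of the project code; the input is assumed to be a list ordered from most to least averse.

```python
def sentiment_scores(ranked_outcomes):
    """Map outcomes (ordered most to least averse) to rank / list length."""
    n = len(ranked_outcomes)
    return {
        outcome: rank / n
        for rank, outcome in enumerate(ranked_outcomes, start=1)
    }


scores = sentiment_scores(["stroke", "taking a pill every day", "hypertension"])
for outcome, score in scores.items():
    print(f"{outcome}: {score:.2f}")
```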

Now suppose the same patient was asked to rank only two severe outcomes, such as stroke and being intubated for respiratory support. Their list might look like:
1. Stroke (sentiment score 1/2 = 0.5)
2. Intubation (sentiment score 2/2 = 1)

Now the patient has given hypertension a score of 1 in one list and intubation a score of 1 in another, yet they probably do not feel the same aversion to hypertension as they do to the prospect of being intubated. How can relative rankings be faithfully represented as absolute values?
Perhaps we simply create a relationship between each entity and every other entity, storing as properties the proportion of times thing A has been preferred over thing B and the number of times they have been compared. Given any group of entities, you could then rank them in order of preference. It might look something like this:  
![Patient preferences figure](Patient_preferences_tracking.png)
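
As a rough illustration of the pairwise idea, the sketch below keeps the two properties described above (proportion preferred and number of comparisons) in an in-memory dictionary rather than in graph relationships. The class `PairwisePreferences` and its methods are hypothetical names, and "preferred" is assumed to mean "ranked as less aversive".

```python
from collections import defaultdict


class PairwisePreferences:
    def __init__(self):
        # (a, b) with a < b -> [times a was preferred over b, times compared]
        self._stats = defaultdict(lambda: [0, 0])

    def record_ranking(self, ranked):
        """Record every pair implied by a list ordered most to least averse."""
        for i, worse in enumerate(ranked):
            for better in ranked[i + 1:]:
                a, b = sorted((worse, better))
                stats = self._stats[(a, b)]
                if better == a:
                    stats[0] += 1
                stats[1] += 1

    def preferred(self, x, y):
        """Return (proportion of times x was preferred over y, comparisons)."""
        a, b = sorted((x, y))
        if (a, b) not in self._stats:
            return None, 0  # never compared by the same person
        a_wins, total = self._stats[(a, b)]
        wins = a_wins if x == a else total - a_wins
        return wins / total, total


prefs = PairwisePreferences()
prefs.record_ranking(["stroke", "taking a pill every day", "hypertension"])
prefs.record_ranking(["stroke", "intubation"])

print(prefs.preferred("hypertension", "stroke"))      # (1.0, 1)
print(prefs.preferred("stroke", "intubation"))        # (0.0, 1)
print(prefs.preferred("hypertension", "intubation"))  # (None, 0)
```

Hypertension and intubation no longer share a misleading score of 1; they simply have no direct comparison yet.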

But how would you compare two things which have never been experienced by the same person, like childbirth vs vasectomy? You could infer it by looking at the other things that have been experienced by people who either had childbirth or vasectomy. For example:  

Kaitlyn's list from most to least averse:  
1. Childbirth
2. Kidney stones

Thomas' list from most to least averse:  
1. Kidney stones
2. Vasectomy

We could infer that for the average person, a reasonable ranking might be:
1. Childbirth
2. Kidney stones
3. Vasectomy

![Inferred_pt_preferences](Inferred_pt_preferences.png)

We could rank the entire list of entities, then divide each entity's rank by the length of the entire list to give each entity a sentiment score from 0 to 1.  
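
One way to sketch this inference is to treat each list as a set of "more averse than" constraints and topologically sort them. This is an assumption about how the merge could work, not the project's implementation; `infer_ranking` is a hypothetical name, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def infer_ranking(partial_rankings):
    """Merge lists ordered most to least averse into one consistent order."""
    sorter = TopologicalSorter()
    for ranking in partial_rankings:
        for item in ranking:
            sorter.add(item)  # register items that appear alone
        for worse, better in zip(ranking, ranking[1:]):
            sorter.add(better, worse)  # 'worse' must come before 'better'
    # Raises graphlib.CycleError if the lists contradict each other.
    return list(sorter.static_order())


kaitlyn = ["Childbirth", "Kidney stones"]
thomas = ["Kidney stones", "Vasectomy"]

ranking = infer_ranking([kaitlyn, thomas])
scores = {item: rank / len(ranking) for rank, item in enumerate(ranking, start=1)}
print(ranking)
print(scores)
```

When two entities are never linked by any chain of comparisons, their relative order in the result is arbitrary, and contradictory lists raise an error instead of producing a ranking. Real data would need a more tolerant aggregation, for example one based on the pairwise proportions.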

#### Individualizing Preference Predictions

Of course, for some people childbirth is actually a strongly desired goal more than an adversity, so we still need a way to cluster patients together in terms of preferences. For example, religious women ages 18-34 would be much more likely to desire childbirth than women outside that demographic.

If everyone ranks two entities the same, it's easy to say that one is worse than the other. For example, almost everyone would be more averse to cancer than a single enema. For cases where people show a high degree of variation in preference, it could be that no strong preference exists, or that about half of people actually do prefer one choice while the other half of people prefer the other choice. To place people in groups that may predict their preferences, we need to observe where their preferences run counter to the majority. For example, people who would prefer to have kidney stones rather than a vasectomy likely have a strong desire to have children, which could tell us a lot about other choices they might make. People who would rather have the vasectomy are probably in a different place in life and would likely value other things differently, too.

So we can cluster patients based on how they rank choices compared to the way most other people have ranked them. A cluster of patients would have a loosely defined set of variations from the norm. For example, men who want to have kids would have a very different absolute rank for vasectomy compared to men who don't want to have kids. Outside of this set of preferences, we could just use the population-average preferences to predict the values of men in either group. We will have to be careful to check and update patient preferences, since they can change over time.
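
A very crude version of this clustering groups patients by the exact set of comparisons on which they disagree with the majority. The sketch below is only an illustration: the patient IDs and choices are made up, the function names are hypothetical, and a real system would likely use a similarity measure rather than exact matching.

```python
from collections import Counter, defaultdict


def majority_choices(patient_choices):
    """For each compared pair, the outcome most patients called worse."""
    votes = defaultdict(Counter)
    for choices in patient_choices.values():
        for pair, worse in choices.items():
            votes[pair][worse] += 1
    return {pair: counts.most_common(1)[0][0] for pair, counts in votes.items()}


def cluster_by_deviation(patient_choices):
    """Group patients by the set of pairs where they differ from the majority."""
    majority = majority_choices(patient_choices)
    clusters = defaultdict(list)
    for patient, choices in patient_choices.items():
        deviating = frozenset(
            pair for pair, worse in choices.items() if majority[pair] != worse
        )
        clusters[deviating].append(patient)
    return dict(clusters)


# Hypothetical data: each value records which outcome the patient called worse.
stones_vs_vasectomy = ("kidney stones", "vasectomy")
cancer_vs_enema = ("cancer", "enema")
patient_choices = {
    "p1": {stones_vs_vasectomy: "kidney stones", cancer_vs_enema: "cancer"},
    "p2": {stones_vs_vasectomy: "kidney stones", cancer_vs_enema: "cancer"},
    "p3": {stones_vs_vasectomy: "vasectomy", cancer_vs_enema: "cancer"},
    "p4": {stones_vs_vasectomy: "kidney stones", cancer_vs_enema: "cancer"},
    "p5": {stones_vs_vasectomy: "vasectomy", cancer_vs_enema: "cancer"},
}

clusters = cluster_by_deviation(patient_choices)
for deviating, patients in clusters.items():
    print(sorted(deviating), patients)
```

Here the patients who call vasectomy worse than kidney stones form their own cluster, while everyone agrees on cancer versus an enema, so that pair contributes nothing to the grouping.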


These sentiment scores can be stored in relationships between the patient and the outcome. Patients who rank things similarly can be clustered together to more accurately predict their preferences on additional outcomes. For example, patients in hospice will generally express stronger aversion for discomfort and pain than they would for painless problems with long-term consequences like hypertension and hyperglycemia. It would make sense to use other hospice patients to make a first guess at what the current hospice patient would want. 

These sentiment scores could be collected during encounters with healthcare providers, through questionnaires, or gamified in a "Would you rather..." style game that would give you a random selection of 3-5 choices. There is a risk that malicious people may submit false preferences, which could significantly damage the system and potentially cause harm. 
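
Drawing a round for the "Would you rather..." game could be as simple as the sketch below. The function name and the outcome list are illustrative assumptions; it addresses only the random selection of 3-5 choices, not the validation of submitted answers.

```python
import random


def would_you_rather_round(outcomes, rng=random):
    """Pick 3-5 distinct outcomes for the player to rank."""
    if len(outcomes) < 3:
        raise ValueError("need at least 3 outcomes to build a round")
    k = rng.randint(3, min(5, len(outcomes)))
    return rng.sample(outcomes, k)


outcomes = [
    "stroke",
    "taking a pill every day",
    "hypertension",
    "intubation",
    "kidney stones",
    "vasectomy",
]
choices = would_you_rather_round(outcomes, rng=random.Random(0))
print(choices)
```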

### Identify the most likely paths between the patient and the outcomes they desire or desire to avoid
We could use a path-finding algorithm that traverses the knowledge graph and the virtualized patient population.
  
![](Reasoning_pathfinding_AF_example.jpg) 
  
![](Reasoning_pathfinding_pathologic_path.jpg)  

### Identify the interventions most likely to put the patient onto their desired paths and block the paths they wish to avoid 
![](Reasoning_pathfinding_recommendations.jpg)  

Relationships that should be included are:

```python
# Relationship types to traverse when searching for paths
relevant_paths = [
    'BIOLOGICAL_PROCESS_HAS_RESULT_BIOLOGICAL_PROCESS',
    'BIOLOGICAL_PROCESS_HAS_RESULT_CHEMICAL_OR_DRUG',
    'CAUSE_OF',
    'CHEMICAL_OR_DRUG_HAS_MECHANISM_OF_ACTION',
    'CHEMICAL_OR_DRUG_HAS_PHYSIOLOGIC_EFFECT',
    'CHEMICAL_OR_DRUG_INITIATES_BIOLOGICAL_PROCESS',
    'DISEASE_HAS_ACCEPTED_TREATMENT_WITH_REGIMEN',
    'DISEASE_HAS_FINDING',
    'DISEASE_MAY_HAVE_CYTOGENETIC_ABNORMALITY',
    'DISEASE_MAY_HAVE_FINDING',
    'HAS_ASSOCIATED_ETIOLOGIC_FINDING',
    'HAS_CAUSATIVE_AGENT',
    'HAS_PATHOLOGICAL_PROCESS',
    'HAS_PHYSIOLOGIC_EFFECT',
    'HAS_PROCESS_OUTPUT',
    'HAS_RISK_FACTOR',
    'INDUCES',
    'MTH_HAS_BRITISH_FORM',
    'MTH_HAS_EXPANDED_FORM',
    'NEGATIVELY_REGULATES',
    'PATHOGENESIS_OF_DISEASE_INVOLVES_GENE',
    'POSITIVELY_REGULATES',
    'PRIMARY_MAPPED_TO',
    'PROCESS_INITIATES_BIOLOGICAL_PROCESS',
    'PROCESS_INVOLVES_GENE',
    'REFERS_TO',
    'REGULATES',
    'REPLACES',
    'SYNONYM',
]
# Join with '|' to form a Cypher relationship-type alternation;
# str.join avoids the trailing '|' that string concatenation leaves behind.
cypher = '|'.join(relevant_paths)
print(cypher)
```

```
BIOLOGICAL_PROCESS_HAS_RESULT_BIOLOGICAL_PROCESS|BIOLOGICAL_PROCESS_HAS_RESULT_CHEMICAL_OR_DRUG|CAUSE_OF|CHEMICAL_OR_DRUG_HAS_MECHANISM_OF_ACTION|CHEMICAL_OR_DRUG_HAS_PHYSIOLOGIC_EFFECT|CHEMICAL_OR_DRUG_INITIATES_BIOLOGICAL_PROCESS|DISEASE_HAS_ACCEPTED_TREATMENT_WITH_REGIMEN|DISEASE_HAS_FINDING|DISEASE_MAY_HAVE_CYTOGENETIC_ABNORMALITY|DISEASE_MAY_HAVE_FINDING|HAS_ASSOCIATED_ETIOLOGIC_FINDING|HAS_CAUSATIVE_AGENT|HAS_PATHOLOGICAL_PROCESS|HAS_PHYSIOLOGIC_EFFECT|HAS_PROCESS_OUTPUT|HAS_RISK_FACTOR|INDUCES|MTH_HAS_BRITISH_FORM|MTH_HAS_EXPANDED_FORM|NEGATIVELY_REGULATES|PATHOGENESIS_OF_DISEASE_INVOLVES_GENE|POSITIVELY_REGULATES|PRIMARY_MAPPED_TO|PROCESS_INITIATES_BIOLOGICAL_PROCESS|PROCESS_INVOLVES_GENE|REFERS_TO|REGULATES|REPLACES|SYNONYM
```
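
A pipe-separated string like this can be dropped into a variable-length relationship pattern to restrict which relationship types a path search may follow. The sketch below is a hedged illustration: `build_path_query` is a hypothetical helper, the `Concept` label and `cui` property are taken from the queries elsewhere in this section, and only three relationship types are used to keep the output short.

```python
def build_path_query(rel_types, max_hops=4, limit=20):
    """Return a Cypher query restricted to the given relationship types."""
    rel_pattern = "|".join(rel_types)  # e.g. CAUSE_OF|INDUCES
    return (
        "MATCH path = (a:Concept {cui: $start_cui})"
        f"-[:{rel_pattern}*..{max_hops}]-"
        "(b:Concept {cui: $end_cui})\n"
        "RETURN path\n"
        f"LIMIT {limit}"
    )


query = build_path_query(["CAUSE_OF", "HAS_RISK_FACTOR", "INDUCES"])
# CUIs for Ischemic Stroke and Atrial Fibrillation, passed as query parameters
params = {"start_cui": "C0948008", "end_cui": "C0004238"}
print(query)
```

Passing the CUIs as parameters rather than formatting them into the string keeps the query text reusable across concept pairs.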


```python
# Consider adding these for the recommendations
recommendation_paths = [
    'CHEMICAL_OR_DRUG_HAS_MECHANISM_OF_ACTION',
    'DISEASE_HAS_ACCEPTED_TREATMENT_WITH_REGIMEN',
    'HAS_CONTRAINDICATED_DRUG',
    'HAS_CONTRAINDICATED_CLASS',
    'HAS_CONTRAINDICATED_MECHANISM_OF_ACTION',
    'HAS_CONTRAINDICATED_PHYSIOLOGIC_EFFECT',
    'HAS_EXPANDED_FORM',
    'HAS_MANIFESTATION',
    'HAS_MECHANISM_OF_ACTION',
    'MAY_DIAGNOSE',
    'MAY_TREAT',
    'MAY_PREVENT',
]
```

```python
# Consider using this to find patients with syndromes
syndrome_paths = [
    'HAS_DEFINING_CHARACTERISTIC',
    'HAS_DEFINITIONAL_MANIFESTATION',
    'HAS_MANIFESTATION',
]
```

```python
# Find any path of up to 4 hops between Ischemic Stroke and Atrial Fibrillation
query = '''
MATCH path = (stroke:Concept {cui: 'C0948008'})-[*..4]-(af:Concept {cui: 'C0004238'})
RETURN path
LIMIT 20'''

# Alternative (overwrites the query above): shortest directed path from
# Ischemic Stroke to Atrial Fibrillation, excluding the listed relationship types
query = '''
MATCH
  (stroke:Concept {cui: 'C0948008'}),
  (af:Concept {cui: 'C0004238'}),
  p = shortestPath((stroke)-[*]->(af))
WHERE none(r IN relationships(p) WHERE type(r) IN [
  'DISEASE_HAS_ASSOCIATED_ANATOMIC_SITE', 'HAS_CLINICAL_COURSE', 'HAS_FINDING_SITE',
  'MAY_TREAT', 'MAY_PREVENT', 'HAS_INHERITANCE_TYPE', 'HAS_MEMBER', 'RELATED_TO',
  'HAS_CONTRAINDICATED_DRUG', 'HAS_SUBJECT_RELATIONSHIP_CONTEXT', 'HAS_FINDING_CONTEXT',
  'NICHD_PARENT_OF', 'HAS_NICHD_PARENT', 'HAS_ANSWER', 'ANSWER_TO'])
RETURN p'''
```