# Overview of current system architecture and capabilities

#### Problem statement: Given a set of causal and correlational relationships (e.g., “IQ causes SAT scores”, “Age is associated with IQ”), infer possible statistical models for either explaining or predicting these relationships.

In [1]:
import sys
!{sys.executable} -m pip install networkx

import tisane as ts



## Tisane API + Functionality
The API lets end users specify their conceptual domain knowledge at a high level.

Note: We could make the API more functional or declarative (e.g., intelligence.cause(test_score, analysis)).

In [2]:
# Start analysis by specifying a TASK
# TASK can be: explanation or prediction
analysis = ts.Tisane(task='explanation')

In [3]:
# Concept declarations
test_score = ts.Concept("Test Score")
intelligence = ts.Concept("Intelligence")
tutoring = ts.Concept("Tutoring")

# Relationship declarations
analysis.addRelationship(intelligence, test_score, "cause")
analysis.addRelationship(tutoring, test_score, "cause")
analysis.addRelationship(intelligence, tutoring, "correlate")

In [4]:
# This is the graph that is constructed above: 
print(analysis.getGraph())

Nodes: ['Intelligence', 'Test Score', 'Tutoring'] has 3 concepts. Edges: [('Intelligence', 'Test Score'), ('Intelligence', 'Tutoring'), ('Tutoring', 'Test Score')] has 3 relationships.
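To make the printed structure concrete, here is a minimal standard-library sketch of a graph that stores the same three relationships. The class `RelationshipGraph` and its methods are hypothetical and are not Tisane's implementation; the notebook installs networkx, which suggests a graph library is used underneath, but that is an assumption.

```python
class RelationshipGraph:
    """Hypothetical stand-in for the structure printed by analysis.getGraph()."""

    VALID_KINDS = ("cause", "correlate")

    def __init__(self):
        # Maps (source, target) -> relationship kind.
        self.edges = {}

    def add_relationship(self, source, target, kind):
        if kind not in self.VALID_KINDS:
            raise ValueError(f"Unknown relationship kind: {kind!r}")
        self.edges[(source, target)] = kind

    def nodes(self):
        # Every concept that appears at either end of an edge, sorted by name.
        return sorted({name for edge in self.edges for name in edge})

    def describe(self):
        return (
            f"Nodes: {self.nodes()} has {len(self.nodes())} concepts. "
            f"Edges: {sorted(self.edges)} has {len(self.edges)} relationships."
        )


graph = RelationshipGraph()
graph.add_relationship("Intelligence", "Test Score", "cause")
graph.add_relationship("Tutoring", "Test Score", "cause")
graph.add_relationship("Intelligence", "Tutoring", "correlate")
print(graph.describe())
```

Keeping the relationship kind on each edge is one way to let later steps treat causal and correlational edges differently.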


### Effects sets generation
Given concept and relationship declarations, generate possible sets of effects for explaining the given concept. 

TODO: 
- provide API verbs for specifying study design and use study design info to generate effects (esp. interaction and mixed effects)

In [5]:
# Generate set of main and interaction effects
print(analysis.generate_effects_sets(test_score))
# analysis.pretty_print_generate_effects_sets(test_score) # useful for debugging, buggy atm

{frozenset({InteractionEffect(effect=()), MainEffect(effect=('Intelligence',))}), frozenset({MainEffect(effect=('Tutoring',)), InteractionEffect(effect=(('Intelligence', 'Tutoring'),))}), frozenset({MainEffect(effect=('Tutoring',)), InteractionEffect(effect=())}), frozenset({InteractionEffect(effect=(('Intelligence', 'Tutoring'),)), MainEffect(effect=())}), frozenset({InteractionEffect(effect=(('Intelligence', 'Tutoring'),)), MainEffect(effect=('Intelligence',))}), frozenset({MainEffect(effect=())}), frozenset({InteractionEffect(effect=(('Intelligence', 'Tutoring'),)), MainEffect(effect=('Tutoring', 'Intelligence'))}), frozenset({InteractionEffect(effect=()), MainEffect(effect=('Tutoring', 'Intelligence'))})}
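The output above contains eight sets, which is what you get by pairing every subset of the two main effects with every subset of the single pairwise interaction (4 x 2). The sketch below reproduces that count under this assumption; `powerset` and `enumerate_effects_sets` are hypothetical helpers, not Tisane functions, and Tisane's actual enumeration may differ.

```python
from itertools import chain, combinations


def powerset(items):
    """Yield every subset of items as a tuple, from the empty set to the full set."""
    items = list(items)
    return chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )


def enumerate_effects_sets(predictors):
    """Pair each subset of main effects with each subset of pairwise interactions."""
    interactions = list(combinations(predictors, 2))
    return [
        {"main": main, "interaction": interaction}
        for main in powerset(predictors)
        for interaction in powerset(interactions)
    ]


effects_sets = enumerate_effects_sets(["Intelligence", "Tutoring"])
for effects in effects_sets:
    print(effects)
```

Like the output above, this enumeration includes sets with an interaction but no corresponding main effects; a later filtering step could remove those if they are not wanted.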


### Statistical model generation 
Given a set of conceptual specifications (i.e., concepts and relationships), generate possible statistical models (e.g., linear regression, logistic regression) for explanation or prediction.

Note: This does not necessarily require a collected dataset if the end user can specify assertions about the data.

TODO: 
- provide some API calls to specify assertions about data properties that are used as constraints in solving for candidate statistical models. 

In [6]:
# Specify data types
test_score.specifyData(dtype="numeric") # Score 0 - 100 
intelligence.specifyData(dtype="numeric") # IQ score 
tutoring.specifyData(dtype="nominal", categories=["afterschool", "none"])

# If we had a dataset, we could also attach it:
# test_score.addData(data=df['score'])

In [8]:
# Explain or Predict
analysis.explain(dv=test_score)
# """" In progress (all rely on .explain())
# analysis.explainWith(dv=test_score, ivs_to_include=[tutoring])
# analysis.explainWithOnly(dv=test_score, ivs_to_include=[tutoring])
# analysis.explainWithout(dv=test_score, ivs_to_include=[tutoring])
# """"

# """" Not yet implemented (all rely on .predict())
# analysis.predict(dv=test_score)
# analysis.predictWith(dv=test_score, ivs_to_include=[tutoring])
# analysis.predictWithOnly(dv=test_score, ivs_to_include=[tutoring])
# analysis.predictWithout(dv=test_score, ivs_to_include=[tutoring])
# """"

<tisane.smt.results.AllStatisticalResults at 0x1125e55c0>

In [14]:
# Get a set of valid statistical models back
# STILL IN PROGRESS
from tisane.smt.knowledge_base import KnowledgeBase, find_statistical_models
ivs = [intelligence, tutoring]
dvs = [test_score]
valid_models = find_statistical_models(ivs=ivs, dvs=dvs)
print(valid_models)
# IVs can have incoming edges --> end-user can specify IVs, DV

['Linear Regression', 'Logistic Regression']
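The in-progress result above lists both models. As a rough illustration of how data-type assertions could narrow the candidates, the sketch below filters models by the type of the dependent variable. `VariableSpec`, `MODEL_RULES`, and `candidate_models` are hypothetical names, and the two rules are ordinary textbook conditions, not the contents of Tisane's knowledge base or its solver.

```python
from dataclasses import dataclass, field


@dataclass
class VariableSpec:
    """Hypothetical record of the data-type assertions made about a concept."""
    name: str
    dtype: str  # "numeric" or "nominal"
    categories: list = field(default_factory=list)


# Hypothetical rules: each model maps to a predicate over the dependent variable.
MODEL_RULES = {
    "Linear Regression": lambda dv: dv.dtype == "numeric",
    "Logistic Regression": lambda dv: dv.dtype == "nominal" and len(dv.categories) == 2,
}


def candidate_models(dv):
    """Return the models whose rule accepts the dependent variable."""
    return [name for name, applies in MODEL_RULES.items() if applies(dv)]


score_spec = VariableSpec("Test Score", "numeric")
tutoring_spec = VariableSpec("Tutoring", "nominal", ["afterschool", "none"])

print(candidate_models(score_spec))
print(candidate_models(tutoring_spec))
```

A fuller version would also check properties of the independent variables and of the data itself, which is where user-specified assertions would come in.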


### TODO: Statistical model construction, execution

Given sets of effects and candidate statistical models to implement, construct and execute the statistical models using each set of effects. 

Output: R scripts? 
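
If the output is R scripts, one possible shape for this step is to translate each effects set into an R model formula and wrap it in a call to R's `lm`. The helpers `to_r_formula` and `to_r_script` are hypothetical, as is the data frame name `df`; this is a sketch of the idea for the linear regression case only, not a planned Tisane interface.

```python
def r_name(concept):
    """Turn a concept name into a plain R column name (assumes spaces are the only issue)."""
    return concept.replace(" ", "_")


def to_r_formula(dv, main_effects, interaction_effects):
    """Build an R formula such as 'y ~ a + b + a:b' from one effects set."""
    terms = [r_name(effect) for effect in main_effects]
    terms += [":".join(r_name(c) for c in pair) for pair in interaction_effects]
    # With no effects at all, fall back to an intercept-only model.
    rhs = " + ".join(terms) if terms else "1"
    return f"{r_name(dv)} ~ {rhs}"


def to_r_script(formula, data_name="df"):
    """Wrap a formula in a minimal R script that fits and summarizes a linear model."""
    return f"model <- lm({formula}, data = {data_name})\nsummary(model)\n"


formula = to_r_formula(
    "Test Score",
    ["Intelligence", "Tutoring"],
    [("Intelligence", "Tutoring")],
)
print(to_r_script(formula))
```

Generating one script per effects set would make it straightforward to compare the resulting models side by side.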