In [1]:
import sys
sys.path.append("../src")
import pandas as pd
import pickle
from utils import solve_eq_string, is_number
from text_to_template import number_parsing
# Show long problem texts in full instead of truncating them
pd.set_option('display.max_colwidth', 800)

%load_ext autoreload
%autoreload 2

### Task details and expectations:  

**Input**  - math problem in free text  
**Output** - numerical solution  
**Helpers** - the two modules described below
#### Text template generation module
Takes a specific problem and factors out its numbers. The resulting template generalizes the original problem.
#### Symbolic solver module 
Takes symbolic equations and produces the numerical solution.

# Load data

In [2]:
math_train = pd.read_json(r"../data/dev_data.json")
math_train[['ans', 'ans_simple', 'equations','text']].head()

| | ans | ans_simple | equations | text |
|---|---|---|---|---|
| 0 | {40;29} | [40, 29] | `[unkn: x,y, equ: x=y+11, equ: 3*x=4*y+4]` | one number is 11 more than another number. Find the two numbers if three times the larger exceeds four times the smaller number by 4. |
| 1 | 6; 9 | [6, 9] | `[unkn: x,y, equ: x + 3 = y, equ: 2*y + 12 = 5*x]` | One number is 3 less than a second number. Twice the second number is 12 less than 5 times the first. Find the two numbers. |
| 2 | {34; 28} | [34, 28] | `[unkn: x,y, equ: x+y=62, equ: x-y=6]` | Find two numbers whose sum is 62 and whose difference is 6. |
| 3 | {42;26} | [42, 26] | `[unkn: x,y, equ: x+y=68, equ: x-y=16]` | the sum of two numbers is 68. their difference is 16. what are the numbers? |
| 4 | {77; 20} | [77, 20] | `[unkn: x,y, equ: x+y=97, equ: x-y=57]` | the sum of two numbers is 97. the difference of the two numbers is 57. find the two numbers |


Example of a verbal math problem

In [3]:
sample = math_train.iloc[1]

In [4]:
sample['text']

'One number is 3 less than a second number. Twice the second number is 12 less than 5 times the first. Find the two numbers.'

The requested equations and their solution

In [5]:
sample[['equations','ans']]

equations    [unkn: x,y, equ: x + 3 = y, equ: 2*y + 12 = 5*x]
ans                                                      6; 9
Name: 1, dtype: object

# Mapping free text to template
You are given the equations and the text, and the following steps are applied (see the code in `number_parsing`):
+ Code in `list_number_mapper`
    + Extract the numbers from the equations
    + Create template equations from the math equations
+ Code in `number_mapper`
    + Extract the numbers from the text
    + Create the text template by replacing numbers with symbols
    + Create the list of symbols that correspond to the extracted numbers
+ Code in `generate_new_equation`
    + Reformat the equations by replacing the symbols in the template equations with numbers
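
The steps above can be illustrated with a minimal, standard-library sketch. It is not the code in `text_to_template`: the names `text_to_template`, `equations_to_template` and `fill_template` are hypothetical, and the sketch only recognizes numbers written as digits (it does not lowercase the text, strip punctuation, or map words such as "twice" to a number).

```python
import re

# A number written in digits that is not part of an identifier or symbol
NUMBER_RE = re.compile(r"(?<![\w.$])\d+(?:\.\d+)?")
SYMBOL_RE = re.compile(r"\$n\d+")


def text_to_template(text):
    """Replace each number in the text with $n0, $n1, ... in order of appearance."""
    numbers = []

    def replace(match):
        numbers.append(match.group())
        return f"$n{len(numbers) - 1}"

    template = NUMBER_RE.sub(replace, text)
    symbols = [f"$n{i}" for i in range(len(numbers))]
    return template, numbers, symbols


def equations_to_template(equations, numbers, symbols):
    """Replace numbers in the equations with the symbol of the same number in the text."""
    first_symbol = {}
    for number, symbol in zip(numbers, symbols):
        first_symbol.setdefault(number, symbol)  # a repeated number keeps its first symbol
    return [
        NUMBER_RE.sub(lambda m: first_symbol.get(m.group(), m.group()), equation)
        for equation in equations
    ]


def fill_template(template_equations, numbers, symbols):
    """Inverse step: put concrete numbers back in place of the symbols."""
    value = dict(zip(symbols, numbers))
    return [
        SYMBOL_RE.sub(lambda m: value[m.group()], equation)
        for equation in template_equations
    ]


# Hypothetical input with all numbers written as digits
text = "One number is 3 less than a second number. 2 times the second number is 12 less than 5 times the first."
equations = ["unkn: x,y", "equ: x + 3 = y", "equ: 2*y + 12 = 5*x"]
template, numbers, symbols = text_to_template(text)
template_equations = equations_to_template(equations, numbers, symbols)
print(template)
print(template_equations)
print(fill_template(template_equations, numbers, symbols))
```

Numbers that appear in an equation but not in the text are left unchanged by this sketch.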

In [6]:
equation_list_template, eq_num_list, text_template, var_list, text_num_list =\
                number_parsing(sample['equations'], sample['text'])

In [7]:
print(f"** Original text **:\t{sample['text']}\n"+
    f"\n** Equation list:\t{sample['equations']}\n"+
    f"\n** Numbers in text **:\t{text_num_list}\n\n"+
    f"{'_'*30}--OUTPUT---{'_'*30}\n"+  
    f"\n** Template text **:\t{text_template}\n"+
    f"\n** Equation list template **:\t{equation_list_template}\n"+
    f"\n** Numbers from the equation **:\t{eq_num_list}\n"+
    f"\n** Symbol list **:\t{var_list}")

** Original text **:	One number is 3 less than a second number. Twice the second number is 12 less than 5 times the first. Find the two numbers.

** Equation list:	['unkn: x,y', 'equ: x + 3 = y', 'equ: 2*y + 12 = 5*x']

** Numbers in text **:	['3', '2', '12', '5']

______________________________--OUTPUT---______________________________

** Template text **:	one number is $n0 less than a second number $n1 the second number is $n2 less than $n3 times the first find the two numbers 

** Equation list template **:	['unkn: x,y', 'equ: x + $n0 = y ', 'equ: $n1*y + $n2 = $n3*x ']

** Numbers from the equation **:	['3', '2', '12', '5']

** Symbol list **:	['$n0', '$n1', '$n2', '$n3']


# Solving symbolic equations 
`solve_eq_string` takes the equations in the same format as the raw data. An additional flag, `integer_flag`, constrains the problem to integer solutions.
The return value is the solution in `float` format.  
The function uses the `sympy` and `wolfram` engines.
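
To make the input format concrete, here is a small standard-library sketch that solves a linear system given as `unkn:` / `equ:` strings. It is not how `solve_eq_string` works (that function relies on `sympy` and `wolfram`); the names `parse_problem`, `residual` and `solve_linear` are hypothetical, the approach only handles linear equations with a unique solution, and it uses `eval`, so it should only be run on trusted data.

```python
from fractions import Fraction


def parse_problem(equations):
    """Split the raw list into unknown names and (lhs, rhs) string pairs."""
    unknowns, pairs = [], []
    for item in equations:
        kind, _, body = item.partition(":")
        if kind.strip() == "unkn":
            unknowns = [name.strip() for name in body.split(",")]
        elif kind.strip() == "equ":
            lhs, rhs = body.split("=")
            pairs.append((lhs.strip(), rhs.strip()))
    return unknowns, pairs


def residual(pair, unknowns, values):
    """Evaluate lhs - rhs for the given values of the unknowns."""
    env = dict(zip(unknowns, values))
    lhs, rhs = pair
    return eval(lhs, {"__builtins__": {}}, env) - eval(rhs, {"__builtins__": {}}, env)


def solve_linear(equations):
    """Solve a linear system by probing coefficients, then Gauss-Jordan elimination."""
    unknowns, pairs = parse_problem(equations)
    n = len(unknowns)
    zero = [Fraction(0)] * n
    rows = []
    for pair in pairs:
        const = residual(pair, unknowns, zero)
        coeffs = []
        for j in range(n):
            point = list(zero)
            point[j] = Fraction(1)
            # For a linear residual, f(e_j) - f(0) is the coefficient of unknown j
            coeffs.append(residual(pair, unknowns, point) - const)
        rows.append(coeffs + [-const])
    for col in range(n):
        # Raises StopIteration if the system has no unique solution
        pivot = next(r for r in range(col, len(rows)) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [float(rows[i][n]) for i in range(n)]


print(solve_linear(["unkn: x,y", "equ: x + 3 = y", "equ: 2*y + 12 = 5*x"]))
```

Exact `Fraction` arithmetic avoids rounding during elimination; the result is converted to `float` only at the end.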

In [8]:
sample["equations"]

['unkn: x,y', 'equ: x + 3 = y', 'equ: 2*y + 12 = 5*x']

In [9]:
answer = sample['ans']
solution = solve_eq_string(sample["equations"], integer_flag=is_number(sample["text"]))
print(f"Answer:\t {answer}")
print(f"Solution:\t {solution}")

Answer:	 6; 9
Solution:	 [6.00000000000000, 9.00000000000000]


# What do you need to do?
Implement a model (a class like the `ExampleModel` class). Use the `score` function to score your model.
You can test your solution with the evaluation script. It should look something like this:

In [10]:
from models.models import ExampleModel

math_train = pd.read_json(r"../data/dev_data.json")
math_test = pd.read_json(r"../data/test_data.json")

## model
model = ExampleModel()
model.fit(math_train)

## print evaluation result
print(f'result score on train: {model.score(math_train,frac=0.1,verbose=False,use_ans=True)}')
print(f'result score on test: {model.score(math_test,frac=1,verbose=True,use_ans=True)}')

NotImplementedError: 

Results will be printed in the following manner:

```
result score on train: 0.861315

0 1 0.14285714285714285
0 2 0.16666666666666666
0 3 0.15384615384615385
0 4 0.14285714285714285
0 5 0.13333333333333333
...
```

The first line is the accuracy on the training data. The lines after it are printed while iterating over the rows of the test data and show the accuracy at the current iteration.
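
The idea of an accuracy that is updated row by row can be sketched as follows. This is only an assumption about the general approach, not the implementation of `score`: the names `answers_match` and `running_accuracy` are hypothetical, and treating answers as order-insensitive lists compared within a small tolerance is a design choice of this sketch.

```python
import math


def answers_match(predicted, expected, tol=1e-6):
    """True if both lists hold the same numbers, in any order, within a tolerance."""
    if len(predicted) != len(expected):
        return False
    remaining = list(expected)
    for p in predicted:
        for i, e in enumerate(remaining):
            if math.isclose(p, e, abs_tol=tol):
                del remaining[i]  # each expected value can be matched only once
                break
        else:
            return False
    return True


def running_accuracy(predictions, expected_answers):
    """Return the accuracy after each row: correct answers so far / rows seen so far."""
    correct = 0
    history = []
    for seen, (pred, exp) in enumerate(zip(predictions, expected_answers), start=1):
        correct += answers_match(pred, exp)
        history.append(correct / seen)
    return history


# Hypothetical predictions compared against values in the style of `ans_simple`
print(running_accuracy([[6.0, 9.0], [28.0, 34.0]], [[6, 9], [34, 28]]))
```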