# Linear Regression Module Instructions


Save the file 'linear_regression.py' in the folder that you will be working in.

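Create a Jupyter notebook in that folder and start it with the following import statement and plotting magic (the `%matplotlib` line only works inside IPython or Jupyter):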

In [1]:
import linear_regression as lr 
%matplotlib notebook  

Next, you will need to create a dataset object. If you are studying the relationship between the variables x and y, the best-fit linear equation is of the form y = mx + c, where m is the slope and c is the intercept.

The example used throughout this tutorial is the set of data points (x,y) = (1.0,3.0), (2.0,6.0), (3.0,6.1), (4.0,9.3).

##  Importing Your Data and Creating A Dataset Object

### To manually add your data, choose one of the following options.
- Create two lists: one containing your data set's x coordinates and one containing your data set's y coordinates. Then, create your data set object with the parameters x and y:

```python 
x = [1.0,2.0,3.0,4.0]
y = [3.0,6.0,6.1,9.3]

dataset = lr.dataset(x,y)
```




- Create a list of tuples, where each tuple is one of your data points. Then, create the dataset by passing your list as the only argument.

```python
data_points = [(1.0,3.0),(2.0,6.0),(3.0,6.1),(4.0,9.3)]

dataset = lr.dataset(data_points)
```
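The two input forms carry the same information, so converting between them takes one line with Python's built-in `zip`. The snippet below is a small illustration that does not depend on `linear_regression`; the variable names are only examples.

```python
# Convert between the two input forms using only built-ins.
data_points = [(1.0, 3.0), (2.0, 6.0), (3.0, 6.1), (4.0, 9.3)]

# List of (x, y) tuples -> two separate lists
x, y = (list(column) for column in zip(*data_points))

# Two separate lists -> list of (x, y) tuples
rebuilt = list(zip(x, y))

print(x)
print(y)
print(rebuilt == data_points)
```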


### Importing Data From A CSV File

- If your data is contained in the csv file *data1.csv*:

```csv
1.0,3.0
2.0,6.0
3.0,6.1
4.0,9.2
```

You can import your data using the `lr.import_csv` function and create your dataset object as follows:

```python
x,y = lr.import_csv('data1.csv')

dataset = lr.dataset(x,y)
```
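For readers curious about what such an import involves, here is a minimal sketch that reads a two-column CSV into separate x and y lists using only the standard library. It is not the code behind `lr.import_csv`, whose implementation is not shown in this tutorial; the helper name `read_xy_csv` is hypothetical, and an in-memory string stands in for *data1.csv*.

```python
import csv
import io

def read_xy_csv(file_obj):
    """Read rows of 'x,y' from an open text file into two lists of floats."""
    x, y = [], []
    for row in csv.reader(file_obj):
        if not row:  # skip blank lines
            continue
        x.append(float(row[0]))
        y.append(float(row[1]))
    return x, y

# In-memory stand-in for the contents of data1.csv
sample = io.StringIO("1.0,3.0\n2.0,6.0\n3.0,6.1\n4.0,9.2\n")
x, y = read_xy_csv(sample)
print(x)
print(y)
```

With a file on disk, the same helper could be used inside `with open('data1.csv', newline='') as f:` by calling `read_xy_csv(f)`.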




## Dataset Object Attributes and Methods

In [3]:
x = [1.0,2.0,3.0,4.0]
y = [3.0,6.0,6.1,9.3]
dataset = lr.dataset(x,y)

## Printing Linear Regression Results

You can print the simple linear regression results for your set of measurements using your dataset object's *results* method, as follows:


In [4]:
dataset.results()

| Slope | Intercept | Std. Error, Slope | Std. Error, Intercept | r |
|---|---|---|---|---|
| 1.9 | 1.35 | 0.425441 | 1.165118 | 0.953343 |
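These numbers can be checked by hand with the standard least-squares formulas. The sketch below uses only the standard library; the function name `simple_fit` is hypothetical, and it is an assumption that `linear_regression` computes its results this way, although the formulas reproduce the printed values for the example data.

```python
import math

def simple_fit(x, y):
    """Ordinary least-squares fit of y = m*x + c with standard errors and r."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

    m = sxy / sxx
    c = y_mean - m * x_mean

    # Residual variance with n - 2 degrees of freedom (two fitted parameters)
    ssr = sum((yi - (m * xi + c)) ** 2 for xi, yi in zip(x, y))
    s2 = ssr / (n - 2)

    se_m = math.sqrt(s2 / sxx)
    se_c = math.sqrt(s2 * (1 / n + x_mean ** 2 / sxx))
    r = sxy / math.sqrt(sxx * syy)
    return m, c, se_m, se_c, r

x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 6.0, 6.1, 9.3]
m, c, se_m, se_c, r = simple_fit(x, y)
print(round(m, 6), round(c, 6), round(se_m, 6), round(se_c, 6), round(r, 6))
```

Rounded to six decimal places, the five values agree with the slope, intercept, standard errors, and r reported by `dataset.results()`.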


If the model you are testing is of the form $y = mx$, with m being a constant, so that the best-fit line must pass through the origin, you can print the results for the best-fit line through the origin as follows:

In [5]:
dataset.results(through_origin = True)

| Slope | Intercept | Std. Error, Slope | r |
|---|---|---|---|
| 2.35 | 0.0 | 0.183333 | 0.990994 |
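For the example data, the through-origin fit has a slope of 2.35, a standard error of about 0.1833, and r of about 0.9910. The sketch below shows one set of formulas that reproduces those values. The function name `fit_through_origin` is hypothetical, and the choices of n - 1 degrees of freedom and an uncentered correlation coefficient are assumptions inferred from the printed numbers, not taken from the module's source.

```python
import math

def fit_through_origin(x, y):
    """Least-squares fit of y = m*x (line forced through the origin)."""
    n = len(x)
    sum_xx = sum(xi * xi for xi in x)
    sum_yy = sum(yi * yi for yi in y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))

    m = sum_xy / sum_xx

    # Only one parameter is fitted, so use n - 1 degrees of freedom
    ssr = sum((yi - m * xi) ** 2 for xi, yi in zip(x, y))
    se_m = math.sqrt(ssr / (n - 1) / sum_xx)

    # Uncentered correlation coefficient (no means subtracted)
    r = sum_xy / math.sqrt(sum_xx * sum_yy)
    return m, se_m, r

x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 6.0, 6.1, 9.3]
m0, se_m0, r0 = fit_through_origin(x, y)
print(round(m0, 6), round(se_m0, 6), round(r0, 6))
```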


To print both the simple linear regression results and the results for the best fit line through the origin, you don't need to use the *results* method twice. You can do it all at once using the optional parameter *with_through_origin*.

In [6]:
dataset.results(with_through_origin = True)

| | Slope | Intercept | Std. Error, Slope | Std. Error, Intercept | r |
|---|---|---|---|---|---|
| Regular Linear Regression | 1.9 | 1.35 | 0.425441 | 1.16512 | 0.953343 |
| Regression Line Through (0,0) | 2.35 | 0.0 | 0.183333 | --- | 0.990994 |

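One way to judge whether the through-origin model is reasonable for your data is to compare the fitted intercept with its standard error. The sketch below is a rough rule of thumb rather than a formal hypothesis test; the helper name and the threshold of 2 are assumptions chosen for illustration, and the inputs are the intercept and its standard error from the regular linear regression results above.

```python
def intercept_in_std_errors(intercept, std_error):
    """Return how many standard errors the intercept lies from zero."""
    return abs(intercept) / std_error

# Intercept and its standard error from the regular linear regression
ratio = intercept_in_std_errors(1.35, 1.16512)
print(round(ratio, 2))

# Rule of thumb (an assumption, not part of linear_regression):
# a ratio below about 2 means the data do not clearly rule out c = 0.
consistent_with_origin = ratio < 2
print(consistent_with_origin)
```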

## Displaying Regression Graph

To display a regression graph, use your dataset object's *graph* method.

In [7]:
dataset.graph()


To graph the best-fit line through the point (0,0), set the optional parameter *through_origin* equal to *True*.

In [8]:
dataset.graph(through_origin = True)


To include a title, use the *graph* method's optional parameter 'title'.
Example:

In [9]:
dataset.graph(title = 'Data Set Results')


Your dataset has the attributes *xlabel*, *ylabel*, *xunits*, and *yunits*, which describe each variable's label and units. For example, if the x-coordinates of your data points are measurements of time, in seconds, and the y-coordinates are measurements of distance travelled, in meters, you can set them as follows:



In [10]:
dataset.xlabel = 't' #changes this data label from the default value 'x' to 't' (which stands for time)
dataset.ylabel = 'd' #changes this data label from the default value 'y' to 'd' (which stands for distance)
dataset.xunits = 's' #s for 'seconds'
dataset.yunits = 'm' #m for meters

dataset.graph(title = 'Distance vs Time')


## More Examples:

In [11]:
B = [0.001370,0.001507,0.001644,0.001781,0.001918,0.002055,0.002192,0.002329,0.002466,0.002603]
w = [0.419718,.445616,.470651,.527999,.560999,.622098,0.615999,.641141,.766242,.826735]

dataset1 = lr.dataset(B,w)

dataset1.xlabel = 'B'
dataset1.ylabel ='$\\omega$'

dataset1.xunits = 'T' #T for tesla
dataset1.yunits = '1/s'

dataset1.results()

| Slope | Intercept | Std. Error, Slope | Std. Error, Intercept | r |
|---|---|---|---|---|
| 313.429949 | -0.032909 | 25.541447 | 0.051724 | 0.974451 |


In [12]:
dataset1.graph(title = '$\\omega$ vs B')


In [13]:
data = [(0.001370,.416657),(0.001507,0.445616),(0.002740,0.597829),(0.003425,1.041988),
        (0.004110,1.261684),(0.004785,1.532484)]
dataset2 = lr.dataset(data)

In [14]:
dataset2.results(with_through_origin = True)

| | Slope | Intercept | Std. Error, Slope | Std. Error, Intercept | r |
|---|---|---|---|---|---|
| Regular Linear Regression | 328.158603 | -0.09832 | 36.888766 | 0.119699 | 0.975647 |
| Regression Line Through (0,0) | 300.242681 | 0.0 | 13.869425 | --- | 0.994708 |


In [15]:
dataset2.xlabel = 'B'
dataset2.xunits = 'T'
dataset2.ylabel = '$\\omega$'
dataset2.yunits = '1/s'
dataset2.graph(through_origin = True, title = '$\\omega$ vs B')


*data_example.csv*

```csv
0.003809,0.0610
0.003562,0.0540
0.004658,0.0790
0.004247,0.0690
0.003425,0.0530
0.005069,0.0945
0.004521,0.087
0.005206,.1070
0.003562,0.0570
```

In [16]:
x,y = lr.import_csv('data_example.csv')
print('x = ', x)
print('y = ', y)

x =  [ 0.003809  0.003562  0.004658  0.004247  0.003425  0.005069  0.004521
  0.005206  0.003562]
y =  [ 0.061   0.054   0.079   0.069   0.053   0.0945  0.087   0.107   0.057 ]


In [17]:
dataset2 = lr.dataset(x,y)
dataset2.xlabel = 'B'
dataset2.ylabel = 'R'
dataset2.xunits = 'T'
dataset2.yunits = 'm'

In [18]:
dataset2.results()

| Slope | Intercept | Std. Error, Slope | Std. Error, Intercept | r |
|---|---|---|---|---|
| 28.095586 | -0.04531 | 2.424444 | 0.010368 | 0.974914 |


In [19]:
dataset2.graph(title = 'Set Results')


In [21]:
measurements = [[0.0007,19.65E7],[0.00075,20.74E7],
                [0.0008,21.86E7],[0.000825,22.68E7],
                [0.000875,23.8E7],[0.0009,24.68E7],
                [0.00095,26.05E7],[0.00099,27.4E7],
                [0.001010,28.00E7],[0.001050,29.0E7]]
dataset3 = lr.dataset(measurements)

In [22]:
dataset3.xlabel = 'B'
dataset3.ylabel = 'F'
dataset3.xunits = 'T' #T for tesla
dataset3.yunits = 'Hz'

dataset3.results(through_origin = True)

| Slope | Intercept | Std. Error, Slope | r |
|---|---|---|---|
| 275522900000.0 | 0.0 | 696673500.0 | 0.999971 |


In [23]:
dataset3.graph(through_origin = True, title = 'F vs B')


In [25]:
B = [0.000576,0.000648,0.000696,0.000744,0.000816,0.000864,0.000912]
F = [19.94,21.04,22.12,23.2,24.41,26.5,28.06]

dataset4 = lr.dataset(B,F)

dataset4.xlabel = 'B'
dataset4.ylabel = 'F'
dataset4.xunits = 'T'
dataset4.yunits = 'MHz'

dataset4.results(with_through_origin = True)

| | Slope | Intercept | Std. Error, Slope | Std. Error, Intercept | r |
|---|---|---|---|---|---|
| Regular Linear Regression | 23924.921384 | 5.645802 | 1746.376002 | 1.32569 | 0.98694 |
| Regression Line Through (0,0) | 31281.474606 | 0.0 | 504.287477 | --- | 0.999221 |


In [26]:
dataset4.graph(title = 'F vs B')


In [27]:
data = [(.7,19.93),(.738,21.05),(.775,22.04),(.825,23.1),(0.863,24.),
        (.887,25.06),(.935,26.04),(.975,27.05),(1.012,28.02),(1.043,29.06)]

dataset5 = lr.dataset(data)

In [41]:
dataset5.results()

| Slope | Intercept | Std. Error, Slope | Std. Error, Intercept | r |
|---|---|---|---|---|
| 26.037536 | 1.744345 | 0.420109 | 0.370656 | 0.99896 |


In [39]:
dataset5.xlabel = 'B'
dataset5.xunits = 'mT'
dataset5.ylabel = 'F'
dataset5.yunits = 'MHz'

In [40]:
dataset5.graph(title = 'Last Set')
