In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Let's make some data! 

---------

We call this "fake data" because it doesn't represent anything real (we didn't get the numbers from a database).  We can use this fake data to practice finding and fitting transiting exoplanet lightcurves.

In [None]:
np.random.seed(99) # this just makes sure our random data doesn't 
                    # change each time we re-run the next cell

# we'll start with a generated array
x = np.arange(-20,20) # array goes from -20 to 19 (20-1)
print('This is the length of this array:',len(x))
print(x)
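To see what the seed does, here is a minimal sketch. It uses `np.random.RandomState`, a separate random number generator object, so running it should not disturb the seeded data used in the cells of this notebook. The variable names are made up for illustration.

```python
import numpy as np

# two separate generators started from the same seed give identical "random" numbers
first_try = np.random.RandomState(99).normal(0, 40, 5)
second_try = np.random.RandomState(99).normal(0, 40, 5)
print(np.array_equal(first_try, second_try))

# a different seed gives a different set of numbers
other_seed = np.random.RandomState(100).normal(0, 40, 5)
print(np.array_equal(first_try, other_seed))
```

The first comparison should print `True`: the same seed always produces the same numbers, which is why the fake data stays put when a cell is re-run.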

To make a plot, we need both x and y values.  Let's make the y values now:

In [None]:
y = x**2 + np.random.normal(0,40,len(x)) 

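# make fake error bars (np.abs keeps them positive; newer versions
# of matplotlib raise an error if any error bar is negative)
yerror = np.abs(np.random.normal(10,30,len(x)))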

### Let's plot our fake data!

In [None]:
plt.figure(figsize=(9,6))
plt.errorbar(x,y,yerror,fmt='o')
#plt.scatter(x,y) # points with no error lines

# plot some lines
line1 = x**2
line2 = x**2 + 10*x
plt.plot(x,line1)
plt.plot(x,line2)

plt.show()

It looks like the orange line (`line1`) follows the blue points better than the green line (`line2`).  We can pretend that this was done on purpose (as if we "fit" the blue points and calculated the orange line).

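One way to tell how good our fit is, is to do a test called a "Chi-Squared Test".  The basic goal of this test is to get the smallest number possible: the smaller the number, the closer the line passes to the data points (relative to the size of their error bars).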

### Equation for Chi-Squared

$$\chi^2 = \sum{\frac{(y-\mathrm{fit})^2}{\mathrm{yerror}^2}}$$

Let's see how to do this with Python.

In [None]:
chi_1 = ((y-line1)**2 / yerror**2).sum()
chi_2 = ((y-line2)**2 / yerror**2).sum()
print('Chi-squared for orange line: %f' % chi_1)
print('Chi-squared for green line: %f' % chi_2)

Notice that the orange line has the smaller number!  This is good, because we could tell by eye that the orange line was doing a better job following the blue points.  Now we have that chi-squared statistic to back our statement up!
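If you plan to compare many lines, it can help to wrap the calculation in a function. The sketch below is one possible way to do that; `chi_squared` and the small demo arrays are made-up names and numbers chosen so the answer can be checked by hand.

```python
import numpy as np

def chi_squared(y, fit, yerror):
    """Add up the squared (data - fit) differences, each divided by its squared error."""
    y = np.asarray(y, dtype=float)
    fit = np.asarray(fit, dtype=float)
    yerror = np.asarray(yerror, dtype=float)
    return np.sum((y - fit)**2 / yerror**2)

# tiny made-up example that is easy to check by hand
y_demo = np.array([1.0, 4.0, 9.0])
yerror_demo = np.array([1.0, 2.0, 1.0])
good_fit = np.array([1.0, 4.0, 9.0])  # goes exactly through the points
bad_fit = np.array([2.0, 6.0, 7.0])   # misses the points by 1, 2, and 2

print(chi_squared(y_demo, good_fit, yerror_demo))  # 0/1 + 0/4 + 0/1 = 0.0
print(chi_squared(y_demo, bad_fit, yerror_demo))   # 1/1 + 4/4 + 4/1 = 6.0
```

Notice that the middle point of `bad_fit` misses by 2 but only adds 1 to the total, because its error bar is twice as big: points with large error bars count less.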

# Want more practice making fake data?
Great!  You can use the cells below to explore more. Feel free to make new cells, make more plots, and challenge yourself to make something complex!

To start you off, here's another example, with all of the variable definitions in the same cell as the plotting code.  **Play with changing up the values and what you're plotting!**

In [None]:
# defining the variables
x = np.linspace(-10,10,50) # this is a fancy way to get a range of numbers where you define how many points you want (in this case, 50)
y = np.sin(x) + np.random.normal(0, 0.3, len(x))
yerror = np.abs(np.random.normal(0, 0.5, len(x))) # np.abs keeps the error bars positive

# plotting!
plt.figure(figsize=(10,5))
plt.errorbar(x,y,yerror,fmt='o')

# making the lines to "fit" the scatter points
y_1 = np.sin(x)
y_2 = np.sin(x)**2
plt.plot(x,y_1)
plt.plot(x,y_2)

plt.show()
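The two ways of building `x` used in this notebook behave a little differently, which the short sketch below tries to show side by side. The names `steps` and `points` are made up for this illustration.

```python
import numpy as np

steps = np.arange(-20, 20)         # counts up by 1 and stops BEFORE 20
points = np.linspace(-10, 10, 50)  # 50 evenly spaced values, INCLUDING both ends

print(len(steps), steps[0], steps[-1])
print(len(points), points[0], points[-1])
```

With `np.arange` you choose the step size and the number of points follows from it; with `np.linspace` you choose the number of points and the spacing follows from it.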

In case it's helpful, here are some functions you can use in your y equations (including some that the numpy package has):


```
x + 10
x**2 + 10*x + 10
np.sin(x)
np.cos(x)
np.tan(x)
np.sqrt(x) # only works for x >= 0 (negative x values give nan)
x**4
np.power(x,4) # does the same thing as the previous line
...
```
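As one idea for exploring, the sketch below builds fake data from one of the functions above and then checks several candidate lines at once, picking the one with the smallest chi-squared. Everything here is made up for illustration: the variable names, the noise level, and the choice to give every point the same error bar. It uses its own random generator so it should not change the data in the other cells.

```python
import numpy as np

rng = np.random.RandomState(7)  # separate generator with an arbitrary seed
x_new = np.linspace(-5, 5, 30)

# fake data: a parabola plus random noise, with the same error bar on every point
y_new = x_new**2 + 10*x_new + 10 + rng.normal(0, 5, len(x_new))
yerror_new = np.full(len(x_new), 5.0)

# candidate "fits" to try, stored by name
candidates = {
    'x + 10': x_new + 10,
    'x**2 + 10*x + 10': x_new**2 + 10*x_new + 10,
    'x**4': x_new**4,
}

# chi-squared for every candidate
results = {}
for name, fit in candidates.items():
    results[name] = ((y_new - fit)**2 / yerror_new**2).sum()
    print('Chi-squared for %s: %f' % (name, results[name]))

# the candidate with the smallest chi-squared follows the points best
best = min(results, key=results.get)
print('Best candidate:', best)
```

Try adding your own entries to `candidates`, or changing the noise level, and see whether the winner changes.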



### Now explore and play with this as much as you want!

---




In [None]:
# space to put more code (also can add cells by using the "+ Code" button above!)



