# Transient Identification

Transients are events in which an object's brightness changes because of some physical phenomenon. We can get clues about the source and its mechanism from a transient's light curve. A light curve is a plot that shows how brightness changes over time.


You and three other astronomers (your group mates) each put in a proposal for telescope time to get the light curve of one of 4 different transients: a supernova, a Cepheid variable star, an active black hole (an AGN), and one of unknown origin. You survive the telescope proposal process and all four proposals get approved! You wait patiently for your data to get collected and sent to you via e-mail.

You get a long e-mail from the telescope facility with "We're sorry :(" as the subject line. 
The e-mail reads:

```
Good afternoon, Dr. Astronomer. 

Thanks for using our telescope for your scientific needs! We value you and the science you create. 
We regret to inform you that there was an issue in our system and we mixed up your data with the data of three other astronomers, Dr. Astronomer, Dr. Astronomer, and Dr. Astronomer. This typically wouldn't be a problem except for the fact that there was an error in the pipeline and we did not save the data using a helpful filename. The good news is that it's obvious that you're all in different fields of Astronomy, so the objects should look distinct. We don't feel like figuring it out, so we're sending a random file to you and the other files to the other Dr. Astronomers. It's up to you all to get the data to the right astronomer. See below for the data. 

We look forward to future proposals!

Clear skies, 
Telescope Facility

P.S. The pipeline did only an ok job at cleaning up the data, so you have to finish that too.

```
    
What the heck! Not only do you have to do extra data clean-up, you have to clean up someone else's data! You reach out to Dr. Astronomer, Dr. Astronomer, and Dr. Astronomer to set up a meeting to figure out who has what data. You all sit together at 1203 W Nevada St. to figure out the mystery. 

Now you all have to go through the process of IDing which transient you have from the file sent to you (the number we gave you). 

In [1]:
# importing necessary modules
import matplotlib.pyplot as plt     # for plotting 
import numpy as np                  # for reading in and storing data

## Read in the data from your file

Now you want to plot how the brightness changes over time to get a better idea of what object or event you're looking at. First, you need to read in the data that was sent to you!

These files are csv files (comma-separated values files). A delimiter is a character that separates pieces of information in a file, and in these files the delimiter is a comma. The columns are separated by commas, and that's how you know which information belongs to which column. Rows are separated by a newline. A newline is represented by the character `\n` and is what you get when you hit the `enter` key on your keyboard between lines. Some programs understand the newline and won't show it to the user; others will show it.
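To see what delimiters and newlines look like in practice, here is a small sketch that pulls apart a made-up csv string by hand. The column names and numbers are invented for illustration and are not taken from your file.

```python
# A tiny made-up file written as one string; "\n" marks the end of each row
raw_text = "TIME,BRIGHTNESS\n0.0,15.2\n1.0,15.1\n2.0,14.9\n"

rows = raw_text.strip("\n").split("\n")   # split the text into rows at each newline
header = rows[0].split(",")               # split the header row at each comma
first_row = rows[1].split(",")            # split the first data row at each comma

print(header)      # the column names
print(first_row)   # the values are still strings at this point
```

Splitting by hand leaves every value as a string, which is one reason to let `numpy` do the reading and the number conversion for you.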

We'll be using `numpy` functions to read in the data. [Click here](https://numpy.org/devdocs/user/how-to-io.html) for more information on reading and writing to files using `numpy`.

You expect the file to be formatted in two columns. One column has the times and the other column has the brightness observed at that time. 
Right now, you don't know which column has what information. 
The first line in the file is a header line. A header line has information about each column in the data, like the quantity and what unit it's in.

Read in the data from the files into 2 arrays: one for the time and one for the brightness.
Be careful to not save the header line into these arrays. Make sure you skip over it when you read in your data into the arrays.

1. Open the file 
2. Print the header line for information about the columns
3. Read only the data into two arrays: `time` and `data`
4. Check the information you have saved in your arrays by printing them out
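The steps above can be sketched on a made-up file held in memory, so nothing here depends on your real data. The text, column names, and the variable names `fake_text`, `table`, `col0`, and `col1` are all assumptions made for this example.

```python
import io
import numpy as np

# Made-up stand-in for a data file (hypothetical column names and values)
fake_text = "TIME,BRIGHTNESS\n0.0,15.2\n1.0,15.1\n2.0,14.9\n"

# Steps 1 and 2: look at the header line on its own
header = fake_text.split("\n")[0]
print(header)

# Step 3: read only the numbers, skipping the header row
table = np.loadtxt(io.StringIO(fake_text), delimiter=",", skiprows=1)
col0 = table[:, 0]   # every row, first column
col1 = table[:, 1]   # every row, second column

# Step 4: check what was saved
print(col0)
print(col1)
```

`io.StringIO` only makes the string behave like an open file. With a real file you would pass the filename to `np.loadtxt` instead.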

In [2]:
filename =  # name of your file here in quotation marks ("filename")
# either transient_1.csv, transient_2.csv, transient_3.csv, or transient_4.csv

# opening your file to read only mode
with open(filename, 'r') as file:

    first_line = file.readline().strip('\n')    # reads in the first line in the file and stores it in `first_line`
    # `.strip('\n')` removes the newline character from the end of the line

    # print the first line of the file to see the header line
    print()


    # use np.loadtxt to read in the data from the file
    # make sure to include that the delimiter is a comma and to skip the first row in the call
    # (in Day 1 Jupyter Notebook)
    # hint: use `delimiter =` and `skiprows =` 
    all_info = np.loadtxt(filename, delimiter = ',', skiprows = 1) 
    


In [1]:
# print all_info to see the formatting

# save the first 2 columns into 2 different arrays using array indexing
time = all_info[:,0]
data = all_info[:,1]

# double check that you saved the right information into the arrays by printing `time` and `data` and comparing them to the output of `all_info`


## Data Cleanup

Figure out what your dataset looks like by plotting it as a scatter plot.

**Remember that magnitudes are on an inverse scale**. Smaller magnitudes correspond to brighter points.

One object is **not** reported in magnitudes. Double check!
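To get a feel for the inverse scale, the sketch below uses the standard magnitude relation, in which a difference of 5 magnitudes corresponds to a factor of 100 in brightness. The function name `flux_ratio` is made up for this example.

```python
def flux_ratio(mag_a, mag_b):
    """How many times brighter object A is than object B, given their magnitudes."""
    # A smaller magnitude means a brighter object, hence the minus sign
    return 10 ** (-0.4 * (mag_a - mag_b))

# A magnitude 10 object compared with a magnitude 15 object: about 100 times brighter
print(flux_ratio(10.0, 15.0))

# Swapping the order gives a ratio below 1, because the first object is now the fainter one
print(flux_ratio(15.0, 10.0))
```

If your y-axis values are magnitudes, the brightest points sit at the bottom of a default plot.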

In [None]:
plt.scatter()
plt.show()

Saying the pipeline did an ok job is...... sure. It's not horrible, but there is definitely some more work you have to do. 

Sometimes the data you get back is formatted unexpectedly or you have extra points that aren't from the data.
Observations aren't perfect, and there can be artifacts from the instrument, the sky, and other sources that add random information to your data. This is called noise.
Remember that the brightness in our data is relative. It could be that the observations were made with a different brightness calibration, which would lead to an offset in the data.
These issues would usually get taken care of before you get a hold of the data, but not in this case. Your next step is to figure out what's wrong with your data.

There are two things that can be wrong with the data you were given: 
1. There are outlier points that are noise and aren't a part of the data itself
2. The data is discontinuous


##### Based on your plot above, what do you think is wrong?
You'll manipulate the data in different ways based on what you think the issue is. **There is only one problem in each dataset.**


### Problem: Extraneous Points

There are some clear outlier points in your data set that you need to get rid of. It's tough to pick a cutoff point for what data to include and exclude. You want to make sure that you leave out those dramatically extreme points without getting rid of actual information. It's not possible to eliminate all noise; you just want to get as much of it out as possible.

Select a maximum -or- a minimum magnitude/flux threshold and get rid of points that fall on the wrong side of it. Try to get rid of as many outliers as you can.
Since you're changing the `data` array, make sure you also change the `time` array so that it only includes the times of the points you keep.
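As one possible alternative to a loop, here is a sketch that uses a boolean mask on made-up data. The arrays, the planted outlier, and the threshold value are all invented for illustration, and whether you need a maximum or a minimum depends on your own plot.

```python
import numpy as np

# Made-up magnitudes with one planted outlier (3.0); not real data
fake_time = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
fake_mag = np.array([15.2, 15.1, 3.0, 14.9, 15.0])

limit = 10.0               # hypothetical threshold picked by eye from a plot
keep = fake_mag > limit    # True where the point should be kept

# Apply the same mask to both arrays so each time stays paired with its brightness
kept_mag = fake_mag[keep]
kept_time = fake_time[keep]

print(kept_time)
print(kept_mag)
```

Using one mask for both arrays guarantees they end up the same length, which a scatter plot requires.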

In [None]:
brightness_limit =    # value here

# new lists that will hold the points that you want to keep in your data
truncated_data = []
truncated_time = []

# truncate the brightness data using loops and if statements (or some other method you've learned)
for i in range():
    # change 'condition 1' to the comparison operations (<, <=, etc.)
    # reminder that if you're working with the magnitude system, it's backwards!!
    if 'condition 1':
        # this index corresponds to a value that falls within the allowed range
        # add to the new lists created above
        new_list.append(original_list[])
        new_list.append(original_list[])

In [None]:
# plot your new data as a scatter plot
# if it looks like you cut off too much or too little, go back and adjust your thresholds

# plot the original data on top of the truncated data to compare the two
plt.scatter()
plt.scatter()
plt.show()

### Problem: Discontinuous Data

There is a jump in the data that doesn't seem to be related to the event itself, and you want to make a continuous light curve. Go back to your data. There is a third column marked `OFFSET`. The value in this column is the overall offset in brightness that comes from the specific calibration used. How nice of them!

An offset of 0 means that no correction is needed for that point. A non-zero value is the amount by which that point's brightness is offset. You want to undo this offset. The offset is not the same for all the points!

Add all the offset values to a new array called `offset`. Use this array to manipulate your brightness data to make it continuous. 
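A minimal sketch of undoing a per-point offset on made-up numbers is shown below. It assumes the offset was added to the true brightness, so it is undone by subtracting. That sign convention is an assumption of this example only, so check which direction your own data needs to move.

```python
import numpy as np

# Made-up brightness values where the 3rd and 4th points jump by 2.0
fake_brightness = np.array([10.0, 10.5, 13.0, 13.5, 12.0])
fake_offset = np.array([0.0, 0.0, 2.0, 2.0, 0.0])

# Assumption: the offset was ADDED to the true value, so subtract it to undo it
corrected = fake_brightness - fake_offset

print(corrected)   # the jump should be gone if the sign convention is right
```

Because both arrays have the same length, `numpy` applies the operation point by point with no loop needed.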

In [None]:
offset = all_info[]

brightness_adjusted =   # operation that will make sure the data is correctly calibrated

In [None]:
# plot your adjusted data as a scatter plot
plt.scatter()


# is it continuous? if not, plot the original data on top to see what might have gone wrong. (hint: is the data "moving" in the right direction?)
# plt.scatter()

## Final Plot and Identification

Now that you've done the Telescope Facility's job, identification should be a little easier!
Make a final plot with your data and with labels on the axes.

Have a chat with your fellow Dr. Astronomers about what transients you think you have and make sure you get the right one to the right astronomer! If they haven't gotten to this point, ask them if they want some help!

Reminder that there are 4 objects/transient events: a Cepheid, an AGN, a supernova, and a secret 4th thing. Use what you've learned over the previous days to identify the objects. Bonus points if you can name the 4th one! (Hint: you've talked about it this week!)

Add an appropriate title to your plot!

In [None]:
plt.scatter()

plt.title()
plt.xlabel()
plt.ylabel()
plt.show()

## Mystery solved! 
Now you can finally go do the science and analyses you originally wanted to do! 
*(But maybe the real science and analyses were the Dr. Astronomers you met along the way)*

If you want to do this project again, try with the other files in the folder!

There are ways to solve the above problems more efficiently using other methods. Go back to your intro notebooks and try to figure it out!