# Plotting Sunspots
This activity looks for patterns in sunspot data from the Sunspot Index and Long-term Solar Observations.

To get started, know that you won't hurt anything by experimenting. If you break something, close the tab and open the activity again to start over.

When you're ready, run each code cell until you get down to **Part One**.

In [None]:
#importing what we'll need
import numpy as np
import pandas as pd
%matplotlib inline
import matplotlib.pyplot as plt
pd.options.display.max_columns = 25

In [None]:
# average monthly sunspots from Git file updated on 7/8/20
data = pd.read_csv("https://github.com/fizzixprof/sunspots/raw/master/Sunspots.csv")

data.head(5) # shows the first 5 rows of the table; change the number to show more or fewer

In [None]:
# .shape gives the (number of rows, number of columns) in the table.
data.shape

## Part One
The table above shows the number of sunspots counted each month.
- How far back does the data go?
- How many rows of data do you have?
- How is time represented in this dataset?

When you're ready, run each code cell until you get down to **Part Two**.
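
One way to check your answers to the Part One questions is to ask pandas directly. The sketch below is only an illustration: it builds a tiny made-up table (the numbers are invented, not real sunspot counts) using the same `Year` and `Sunspots` column names as this notebook. The names `sample`, `earliest`, and `n_rows` are placeholders chosen for this example.

```python
import pandas as pd

# Made-up values for illustration only; not real sunspot counts.
sample = pd.DataFrame({
    "Year": [1900.0, 1925.0, 1950.0, 1975.0],
    "Sunspots": [10.0, 80.0, 120.0, 30.0],
})

earliest = sample["Year"].min()  # smallest value in the Year column
n_rows = len(sample)             # number of rows in the table
print(earliest, n_rows)
print(sample.dtypes)             # how each column is stored (number, text, ...)
```

Swapping `sample` for `data` should run the same checks on the real file.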

In [None]:
# makes the scatterplot
x=data.Year # defines the x-axis
y=data.Sunspots # defines the y-axis
plt.scatter(x,y, s=1) # defines the graph as a scatterplot with x and y
plt.xlabel('x label') # label your x-axis
plt.ylabel('y label') # label your y-axis
plt.title('title me!') # title your graph
plt.axis([1749, 2020, 0, 300]) # dimensions of your scatterplot
plt.show()

## Part Two
The code above makes a scatterplot with one point for the number of sunspots each month.

Do you notice a pattern in the graph?

The title and axis labels on the graph could use some work. Try editing the code above the graph, then run the code again to see the changes.

## Part Three  
Now it's time to conduct your own investigation. The code below lets you filter the data set by year (it's called a "query").  
- Find the line that does a query on "Year".  
- Try filtering the data for only sunspots since 1900, then make a plot.
- Try filtering the data for only sunspots between 1900 and 1950. Change the x-axis so you can zoom in on this data.

Is the cycle length about the same throughout? Approximately how many years does one cycle last?

Cut your data using a different time frame and see if the cycles are about the same length as you see here.
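
If you get stuck writing a query with two conditions, here is a minimal sketch of the idea. It uses a small made-up table (invented numbers, not real sunspot counts), and the names `sample`, `since_1900`, and `window` are placeholders for this example only.

```python
import pandas as pd

# Made-up values for illustration only; not real sunspot counts.
sample = pd.DataFrame({
    "Year": [1890.0, 1900.0, 1925.0, 1950.0, 1975.0],
    "Sunspots": [5.0, 10.0, 80.0, 120.0, 30.0],
})

# One condition: keep rows from 1900 onward.
since_1900 = sample.query("Year >= 1900")

# Two conditions joined with "and": keep rows from 1900 through 1950.
window = sample.query("Year >= 1900 and Year <= 1950")
print(window)
```

To zoom in on the filtered range, change the first two numbers in `plt.axis([...])`, which set the left and right edges of the x-axis.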

In [None]:
plotdata = data.query('Year < 2000') # redefining the data to plot based on your query
x=plotdata['Year']
y=plotdata['Sunspots']
plt.scatter(x,y, s=1)
plt.xlabel('x label')
plt.ylabel('y label')
plt.title('title me!')
plt.axis([1749, 2020, 0, 300])
plt.show()

## Part Four  
Let's find specific values: sort your data to find the 3 months with the greatest number of sunspots. Then find the top 20 months.

*   Are they near each other? Similar years?
*   What would happen if you changed your sort from tail to head?
*   Do some of those values surprise you?

When you see surprising values, you can investigate where the numbers in your data come from. The data [website](http://www.sidc.be/silso/infosnmstot) has a button that explains how the sunspot averages are determined.
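
Sorting from smallest to largest and reading the tail is one way to find the highest values. As a hedged alternative, the sketch below sorts from largest to smallest and reads the head, and also tries pandas' `nlargest`. The table is made up (invented numbers, not real sunspot counts), and `sample`, `top3`, and `top3_alt` are placeholder names for this example.

```python
import pandas as pd

# Made-up values for illustration only; not real sunspot counts.
sample = pd.DataFrame({
    "Year": [1890.0, 1900.0, 1925.0, 1950.0, 1975.0],
    "Sunspots": [5.0, 10.0, 80.0, 120.0, 30.0],
})

# Sort from most to fewest sunspots, then keep the first 3 rows.
top3 = sample.sort_values("Sunspots", ascending=False).head(3)

# nlargest picks the 3 rows with the largest Sunspots values in one step.
top3_alt = sample.nlargest(3, "Sunspots")
print(top3)
```

Both approaches should return the same rows here; which one reads more clearly is a matter of taste.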

In [None]:
# Sort the data to find the highest values
# data is already a pandas data frame, so it can be sorted directly
sort_by_sunspot_number = data.sort_values('Sunspots') # Sorts the rows from fewest to most sunspots
print(sort_by_sunspot_number.tail(n=3)) # Shows the last 3 rows, which are the top 3

In [None]:
# write your own command to see the top 20 months for sunspots

In [None]:
# replace tail with head to find the bottom 20 months for sunspots

## Part Five  
Now let's look at the same data a different way. The original data files are located here: http://www.sidc.be/silso/datafiles

Find the data for yearly totals and download it to your computer. Open it in Excel and use Text to Columns with semicolons as the delimiter. Remove any extra columns and add a row at the top with the headers. Save it and upload it to your GitHub.

Pull your newly created data over for a second round of plotting.

*   What does the scatterplot look like now?

For this data, try connecting the points with lines instead of keeping the scatterplot. Use a different command from matplotlib.
*   What does the graph look like now?
*   Are your dimensions the same for this dataset?
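
Here is a minimal sketch of reading semicolon-separated text and drawing a connected line plot. It is an illustration under stated assumptions: the text in `raw` is made up (invented numbers standing in for a downloaded file with no header row), and `raw`, `yearly`, and `lines` are placeholder names for this example.

```python
import io
import pandas as pd
import matplotlib.pyplot as plt

# Made-up semicolon-separated text standing in for a downloaded file.
raw = "1900.5;10.0\n1901.5;20.0\n1902.5;35.0\n"

# sep=';' splits each line on semicolons; names= supplies the column headers.
yearly = pd.read_csv(io.StringIO(raw), sep=";", names=["Year", "Sunspots"])

# plt.plot draws the points in order and connects them with line segments.
lines = plt.plot(yearly["Year"], yearly["Sunspots"])
plt.xlabel("Year")
plt.ylabel("Sunspots")
plt.show()
```

If your own file already has a header row, leaving out `names=` lets pandas read the headers from the file instead.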

In [None]:
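from google.colab import files
file = files.upload() # opens a file picker so you can upload the file from your computer
# Reads the first uploaded file. sep=';' splits columns on semicolons, and names= supplies
# the column headers, so use it as written only if your file has no header row of its own.
data2 = pd.read_csv(list(file.keys())[0], names=["Year", "Sunspots", "Unk", "Unk1", "Unk2"], sep=';', index_col=None)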


In [None]:
data2.head()
# plot your data
# the command 'plt.plot(x,y)' will connect lines between points on a scatterplot


## Part Six  
What happens if you go back up to Part One and rerun the first graph you made?
*   What does it look like?
*   What happened?
*   What does this tell you about working in a Colab Notebook?





## Saving Your Work  
This notebook is running on a Google server on a distant planet, and it deletes what you've done when you close this tab. To save your work for later use or analysis, you have a few options:  
- File > "Save a copy in Drive" will save it to your Google Drive in a folder called "Colab Notebooks". You can run it later from there.  
- File > "Download .ipynb" to save to your computer (and run with Jupyter software later)  
- File > Print to print.  
- To save an image of a graph or chart, right-click on it and select Save Image as ...  
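
You can also save a graph from code instead of right-clicking. The sketch below is an illustration only: the numbers are made up, and the file name `sunspot_plot.png` is a hypothetical choice. In a hosted notebook the file lands in the session's temporary storage, so you would still need to download it before closing the tab.

```python
import matplotlib.pyplot as plt

# Made-up values for illustration only; not real sunspot counts.
years = [1900, 1901, 1902]
counts = [10, 20, 35]

plt.scatter(years, counts, s=10)
plt.xlabel("Year")
plt.ylabel("Sunspots")

# Save before plt.show(); saving afterward can produce a blank image.
plt.savefig("sunspot_plot.png", dpi=150)  # hypothetical file name
plt.show()
```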


---


## Credits
The data come from the [Sunspot Index and Long-term Solar Observations](http://www.sidc.be/silso/datafiles) (source: WDC-SILSO, Royal Observatory of Belgium, Brussels). This notebook was created by physics teacher and QuarkNet member Tiffany Coke using a template created by fellow physics teacher Peter Apps, York Middle/High School, Retsof, NY, and further developed by [Adam LaMee](https://adamlamee.github.io/). Thanks to the great folks at [Binder](https://mybinder.org/) and [Google Colaboratory](https://colab.research.google.com/notebooks/intro.ipynb) for making this notebook interactive without you needing to download it or install [Jupyter](https://jupyter.org/) on your own device. Find more activities and license info at [CODINGinK12.org](http://www.codingink12.org).