changes to exercise formatting
mike-ivs committed Sep 14, 2023
1 parent f5d8460 commit 638cb1b
Showing 1 changed file: `_episodes/02-regression.md` (7 additions, 13 deletions)
Comparing the plots and errors, it seems that a polynomial regression of `N=2` is a far better fit to Dataset II than a linear fit. In fact, our polynomial fit almost perfectly matches Dataset II... which is because Dataset II was created from an `N=2` polynomial equation!
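As a standalone illustration of the idea, here is a minimal sketch of an `N=2` fit using scikit-learn's `PolynomialFeatures` together with `LinearRegression`. The hard-coded arrays below are Anscombe's Dataset II values used as stand-in data, and the code does not assume the lesson's own helper functions:

~~~
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import PolynomialFeatures

# Stand-in data: Anscombe's Dataset II as plain arrays
x = np.array([10., 8., 13., 9., 11., 14., 6., 4., 12., 7., 5.])
y = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])

# Expand x into [1, x, x^2] features, then fit an ordinary linear regression
poly_features = PolynomialFeatures(degree=2)
x_poly = poly_features.fit_transform(x.reshape(-1, 1))
model = LinearRegression().fit(x_poly, y)

# Root mean squared error of the polynomial fit on the training data
rmse = np.sqrt(mean_squared_error(y, model.predict(x_poly)))
print("RMSE =", rmse)
~~~
{: .language-python}

Because Dataset II really is quadratic (up to rounding), the reported RMSE is close to zero.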

> ## Exercise: Perform and compare linear and polynomial fits for Datasets I, III, and IV.
> 1. Which performs better for each dataset?
> 2. Modify your polynomial regression function to take `N` as an input parameter to your regression model. How does changing the degree of polynomial fit affect each dataset?
> > ## Solution
> > Question 1.
> > ~~~
> > for ds in ["I","II","III","IV"]:
> >     # Sort our data in order of our x (feature) values
> >     data_ds = data[data["dataset"]==ds]
> >     data_ds = data_ds.sort_values("x")
> >
> >     fit_a_linear_model(data_ds["x"],data_ds["y"])
> >     fit_a_poly_model(data_ds["x"],data_ds["y"])
> >
> > ~~~
> > {: .language-python}
> > The `N=2` polynomial fit is far better for Dataset II. According to the RMSE, the polynomial is a slightly better fit for Datasets I and III, though it could be argued that a linear fit is good enough for both.
> > Dataset III looks like a linear relation with a single outlier, rather than a truly non-linear relation. The polynomial and linear fits perform equally well (or poorly) on Dataset IV.
> > For Dataset IV it looks like `y` may be a better estimator of `x` than `x` is of `y`.
> >
> > Question 2.
> > ~~~
> > def fit_a_poly_model(x,y,N):
> >     # Define our estimator/model(s)
> >     poly_features = PolynomialFeatures(degree=N)
> >     # ...
> > ~~~
> > {: .language-python}
> >
> > ~~~
> > for ds in ["I","II","III","IV"]:
> >     # Sort our data in order of our x (feature) values
> >     data_ds = data[data["dataset"]==ds]
> >     data_ds = data_ds.sort_values("x")
> >
> >     fit_a_linear_model(data_ds["x"],data_ds["y"])
> >     for N in range(2,11):
> >         print("Polynomial degree =",N)
> >         fit_a_poly_model(data_ds["x"],data_ds["y"],N)
> >         plt.show()
> > ~~~
> > {: .language-python}
> >
> > With a large enough polynomial you can fit a curve through every point that has a unique `x` value.
> > Datasets II and IV remain unchanged beyond `N=2`, as the polynomial has either converged (Dataset II) or cannot model the data (Dataset IV).
> > Datasets I and III slowly decrease their RMSE as `N` is increased, but it is likely that these more complex models are overfitting the data. Overfitting is discussed later in the lesson.
> {: .solution}
{: .challenge}
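For reference, one possible shape for the parameterised helper from the exercise is sketched below. Only the function name comes from the lesson; the body and the stand-in Dataset I arrays are assumptions, and the lesson's real version also produces plots:

~~~
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import PolynomialFeatures

def fit_a_poly_model(x, y, N):
    # Expand x into polynomial features up to degree N, then fit linearly
    poly_features = PolynomialFeatures(degree=N)
    x_poly = poly_features.fit_transform(np.asarray(x, dtype=float).reshape(-1, 1))
    model = LinearRegression().fit(x_poly, y)
    # Report the training RMSE for this degree
    rmse = np.sqrt(mean_squared_error(y, model.predict(x_poly)))
    print("Polynomial degree =", N, "RMSE =", rmse)
    return rmse

# Stand-in values for Anscombe's Dataset I
x = [10., 8., 13., 9., 11., 14., 6., 4., 12., 7., 5.]
y = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]

# Training RMSE can only shrink (in exact arithmetic) as N grows
rmses = [fit_a_poly_model(x, y, N) for N in range(2, 11)]
~~~
{: .language-python}

Because the training error never rises as `N` grows, a shrinking RMSE on its own does not mean the model generalises any better.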


## Let's explore a more realistic scenario

Now that we have some convenient Python functions to perform quick regressions on data, it's time to explore a more realistic regression modelling scenario.