---
author: Ni Shi (shi_ni@bentley.edu)
---






We're going to create the `pressure` dataset as example data.
It contains 19 paired observations of temperature and pressure.
You would use your own data instead.





In [None]:
import pandas as pd
pressure = pd.DataFrame( {
    'temperature': [0,20,40,60,80,100,120,140,160,180,200,
                    220,240,260,280,300,320,340,360],
    'pressure':    [0.0002,0.0012,0.0060,0.0300,0.0900,0.2700,0.7500,
                    1.8500,4.2000,8.8000,17.3000,32.1000,57.0000,96.0000,
                    157.0000,247.0000,376.0000,558.0000,806.0000]
} )
pressure

Unnamed: 0,temperature,pressure
0,0,0.0002
1,20,0.0012
2,40,0.006
3,60,0.03
4,80,0.09
5,100,0.27
6,120,0.75
7,140,1.85
8,160,4.2
9,180,8.8



Let's model temperature as the dependent variable, with the logarithm of pressure
as the independent variable. To transform pressure, we use NumPy's `np.log` function,
as shown below. It computes the natural logarithm (base $e$).









In [None]:
import numpy as np

# Compute the logarithm of pressure
X = pressure[['pressure']]
log_X = np.log(X)

# Build the linear model using Scikit-Learn
from sklearn.linear_model import LinearRegression
y = pressure['temperature']
log_model = LinearRegression()
log_model.fit(log_X, y)

# Display regression coefficients and R-squared value of the model
log_model.intercept_, log_model.coef_, log_model.score(log_X, y)

(153.97045660511063, array([23.78440995]), 0.9464264282083346)


The model is $\hat t = 153.97 + 23.784\log p$,
where $t$ stands for temperature, $p$ for pressure, and $\log$ is the natural logarithm.
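
To use the fitted model on a new observation, the same transformation has to be applied to the input before calling `predict`. The cell below is a minimal sketch of that step. The value `p_new = 50.0` is a made-up pressure chosen only for illustration, and the data and model are rebuilt so that the cell runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Rebuild the example data (same values as above) and refit the log model
pressure = pd.DataFrame({
    'temperature': list(range(0, 361, 20)),
    'pressure': [0.0002, 0.0012, 0.0060, 0.0300, 0.0900, 0.2700, 0.7500,
                 1.8500, 4.2000, 8.8000, 17.3000, 32.1000, 57.0000, 96.0000,
                 157.0000, 247.0000, 376.0000, 558.0000, 806.0000]
})
log_X = np.log(pressure[['pressure']])
y = pressure['temperature']
log_model = LinearRegression().fit(log_X, y)

# Hypothetical new pressure reading (illustrative value, not from the data)
p_new = 50.0
new_X = pd.DataFrame({'pressure': [p_new]})

# Apply the same log transformation before predicting
t_hat = log_model.predict(np.log(new_X))[0]

# The prediction should agree with intercept + slope * ln(p) computed by hand
by_hand = log_model.intercept_ + log_model.coef_[0] * np.log(p_new)
print(t_hat, by_hand)
```

Passing the untransformed pressure to `predict` would still run without error, but the result would not correspond to the fitted equation.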

Another example transformation is the square root transformation.  As with `np.log`,
just apply the `np.sqrt` function to the appropriate term when defining the model.





In [None]:
# Compute the square root of pressure
X = pressure[['pressure']]
sqrt_X = np.sqrt(X)

# Build the linear model using Scikit-Learn
from sklearn.linear_model import LinearRegression
y = pressure['temperature']
sqrt_model = LinearRegression()
sqrt_model.fit(sqrt_X, y)

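# Display regression coefficients and R-squared value of the model
# (score on sqrt_X, the same inputs the model was fit on)
sqrt_model.intercept_, sqrt_model.coef_, sqrt_model.score(sqrt_X, y)

(98.56139249917803, array([11.44621468]), 0.8048)

(The R-squared value is shown rounded to four decimal places.)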


The model is $\hat t = 98.561 + 11.446\sqrt{p}$,
with $t$ and $p$ having the same meanings as above.
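
One way to avoid passing the wrong inputs to `score` or `predict` is to make the transformation part of the model itself. The cell below is a possible sketch using scikit-learn's `make_pipeline` and `FunctionTransformer`. The helper `fit_transformed` and the dictionaries `models` and `scores` are names introduced here for illustration, and the data are rebuilt so that the cell runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer

# Rebuild the example data (same values as above)
pressure = pd.DataFrame({
    'temperature': list(range(0, 361, 20)),
    'pressure': [0.0002, 0.0012, 0.0060, 0.0300, 0.0900, 0.2700, 0.7500,
                 1.8500, 4.2000, 8.8000, 17.3000, 32.1000, 57.0000, 96.0000,
                 157.0000, 247.0000, 376.0000, 558.0000, 806.0000]
})
X = pressure[['pressure']]
y = pressure['temperature']

def fit_transformed(func):
    """Fit a linear model on func(X); the pipeline applies func itself."""
    model = make_pipeline(FunctionTransformer(func), LinearRegression())
    return model.fit(X, y)

# Hypothetical helper usage: one pipeline per transformation
models = {'log': fit_transformed(np.log), 'sqrt': fit_transformed(np.sqrt)}

# Each pipeline takes the raw pressures and transforms them internally,
# so both models are scored on the same untransformed X
scores = {name: model.score(X, y) for name, model in models.items()}
for name, r2 in scores.items():
    print(name, round(r2, 4))
```

Because each pipeline carries its own transformation, there is no separate `log_X` or `sqrt_X` to mix up when scoring or predicting.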






