<a href="https://colab.research.google.com/github/kangwonlee/nmisp/blob/lecture-idea/28_interpolation/00_interpolation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Interpolation



Interpolation estimates values that lie between known data points, for instance between two measurements obtained from an experiment.



Let's take a look at the following table.



In [None]:
# Import pandas for tables
import pandas as pd
# pylab exposes NumPy and matplotlib.pyplot in a single namespace
import pylab as py



In [None]:
# Seed the random number generator; with no argument the seed comes from the OS, so each run differs
py.seed()

# Parameters
a = 0.5
b = 1.5

# x array: 0, 1, ..., 5 (the extra 0.5 makes arange include the end point 5)
x_array = py.arange(5+0.5)

# True values of y
y_true = a * x_array + b

# Measurement noise, uniformly distributed in [-0.5, 0.5)
noise = py.random(x_array.shape) - 0.5

# Measurement values
y_measurement = y_true + noise

# Organize data in table form
# https://stackoverflow.com/questions/35160256
df = pd.DataFrame(
    {
        'x':x_array,
        "y_measurement":y_measurement
    },
    columns=['x', "y_measurement"],
)

# Plot data points
ax = df.plot.scatter(x='x', y="y_measurement", label="y_measurement")

py.show()
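
Calling `py.seed()` without an argument reseeds the generator from the operating system, so the noise differs on every run. The following is a minimal sketch of a reproducible alternative, assuming NumPy's `default_rng` and an arbitrary seed of 0; neither is part of the original notebook.

```python
import numpy as np

# Hypothetical fixed seed; any integer would do
rng_a = np.random.default_rng(0)
rng_b = np.random.default_rng(0)

# Uniform noise in [-0.5, 0.5), the same recipe as in the cell above
noise_a = rng_a.random(6) - 0.5
noise_b = rng_b.random(6) - 0.5

# Two generators with the same seed produce identical noise
print(np.array_equal(noise_a, noise_b))
```

With a fixed seed the table and the plots stay the same between runs, which makes results easier to compare.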



In [None]:
# Present the table
df



Let's estimate the $y$ values in the interval $0 \le x \le 1$.



## Linear interpolation



### Formulation



We can write the equation of the straight line passing through two points $(x_1, y_1)$ and $(x_2, y_2)$.
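
Written out by hand, the line through the two points takes the point-slope form:

$$
y = \frac{y_2 - y_1}{x_2 - x_1}\left(x - x_1\right) + y_1
$$

Substituting $x = x_1$ gives $y = y_1$, and substituting $x = x_2$ gives $y = y_2$, so the line passes through both points.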



In [None]:
# Import symbolic processor module
import sympy as sym

# Initialize printing equations
sym.init_printing()



In [None]:
# Declare symbols
x = sym.symbols('x')

# Multiple symbols using `:`
x1, x2 = sym.symbols('x1:3')
y1, y2 = sym.symbols('y1:3')

# Define slope
slope = (y2 - y1) / (x2 - x1)

# Define the straight line
y_interp = slope * (x - x1) + y1

# Present the equation
y_interp



Collecting the terms in $x$, we may rewrite it as follows.



In [None]:
sym.collect(sym.expand(y_interp), x, sym.factor)



We can find the $y_i$ corresponding to an arbitrary $x_i$ (within the interval $0 \le x \le 1$) as follows.



In [None]:
# Declare x_i as a SymPy symbol
x_i = sym.symbols('x_i')

# Prepare a dictionary of substitution pairs
substitution_dict = {
    # "substitute x with x_i"
    x: x_i,
    x1: x_array[0],
    x2: x_array[1],
    y1: y_measurement[0],
    y2: y_measurement[1],
}

# Substitution
y_i_sy = y_interp.subs(substitution_dict)

# Result of substitution
y_i_sy
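
The same formula can be evaluated numerically without SymPy. The following is a minimal sketch; `linear_interp` is a hypothetical helper, and the sample points are made up for illustration rather than taken from the random measurements above.

```python
import numpy as np

def linear_interp(x, x1, y1, x2, y2):
    """Evaluate the straight line through (x1, y1) and (x2, y2) at x."""
    slope = (y2 - y1) / (x2 - x1)
    return slope * (x - x1) + y1

# Hypothetical sample points
x_known = [0.0, 1.0]
y_known = [1.5, 2.0]

# Interpolate a quarter of the way between the two points
y_quarter = linear_interp(0.25, x_known[0], y_known[0], x_known[1], y_known[1])
print(y_quarter)  # 1.625

# NumPy's interp() should agree inside the interval
print(np.interp(0.25, x_known, y_known))
```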



SymPy can also generate source code for an expression in several programming languages.



In [None]:
python_code = sym.python(y_interp)
print(python_code)



In [None]:
c_code = sym.ccode(y_interp)
print(c_code)



In [None]:
fortran_code = sym.fcode(y_interp)
print(fortran_code)
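
Besides printing source code, SymPy can turn an expression directly into a callable with `lambdify()`. The sketch below redefines the symbols so that it runs on its own; the name `f_line` and the sample points are illustrative assumptions.

```python
import sympy as sym

# Same symbols and straight-line expression as above
x, x1, x2, y1, y2 = sym.symbols('x x1 x2 y1 y2')
y_line = (y2 - y1) / (x2 - x1) * (x - x1) + y1

# Convert the symbolic expression into a NumPy-aware Python function
f_line = sym.lambdify((x, x1, y1, x2, y2), y_line, 'numpy')

# Hypothetical points (0, 1.5) and (1, 2.0), evaluated at x = 0.5
y_half = f_line(0.5, 0.0, 1.5, 1.0, 2.0)
print(y_half)
```

The resulting function also accepts NumPy arrays for `x`, so many points can be evaluated in one call.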



### Practice



In practice, we usually call NumPy's `interp()` function.



In [None]:
# x values to interpolate
x_i = py.linspace(x_array[0], x_array[-1], 50+1)

# Interpolate
y_i = py.interp(x_i, x_array, y_measurement)
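
One detail worth knowing is how `interp()` treats points outside the data range. The sketch below uses `numpy.interp` directly with made-up sample values; by default, out-of-range points take the nearest end value instead of being extrapolated.

```python
import numpy as np

# Hypothetical known points; the x values must be increasing
xp = np.array([0.0, 1.0, 2.0])
fp = np.array([1.0, 3.0, 2.0])

# Inside the range: piecewise-linear interpolation
inside = np.interp([0.5, 1.5], xp, fp)
print(inside)  # [2.  2.5]

# Outside the range: clamped to the first and last values by default
outside = np.interp([-1.0, 3.0], xp, fp)
print(outside)  # [1. 2.]

# The left and right arguments override the out-of-range values
flagged = np.interp([-1.0, 3.0], xp, fp, left=np.nan, right=np.nan)
print(flagged)  # [nan nan]
```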



In [None]:
# Plot data points
ax = df.plot.scatter(x='x', y="y_measurement", label="y_measurement")

# Plot interpolation
ax.plot(x_i, y_i, '.', label='$y_{interp}$')

# Show legend table
py.legend(loc=0)

py.show()



### `pandas`



A `pandas` `DataFrame` also provides simple interpolation features.



In [None]:
# Reindex on the finer grid (df's default index 0..5 matches x); unmatched rows become NaN
df_interp_nan = df.reindex(x_i)



In [None]:
# http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.interpolate.html
df_interp = df_interp_nan.interpolate(method='linear')
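
Note that `method='linear'` ignores the index and treats the rows as equally spaced, which is harmless here because the points in `x_i` are evenly spaced. The sketch below, using a made-up `Series` with an unevenly spaced index, shows how the result differs from `method='index'`.

```python
import numpy as np
import pandas as pd

# Hypothetical series with an unevenly spaced index and one missing value
s = pd.Series([0.0, np.nan, 10.0], index=[0.0, 1.0, 10.0])

# 'linear' ignores the index: the gap is filled halfway between its neighbours
by_position = s.interpolate(method='linear')
print(by_position.iloc[1])  # 5.0

# 'index' uses the index values as the x coordinates
by_index = s.interpolate(method='index')
print(by_index.iloc[1])  # 1.0
```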



In [None]:
# Plot interpolation
ax = df_interp.plot.scatter(x='x', y="y_measurement", label='$y_{interp}$', c='orange')
df.plot.scatter(x='x', y="y_measurement", ax=ax, label="y_measurement")

# Show legend table
py.legend(loc=0)

py.show()



## Cubic spline curve



A [spline](https://en.wiktionary.org/wiki/spline) is a long, thin, flexible strip of wood. Drafters used it to draw smooth curves.



<a href="https://en.wikipedia.org/wiki/Flat_spline">
    <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/f/fd/Spline_(PSF).png/1200px-Spline_(PSF).png" alt="Spline" width="200"/>
</a>



"Cubic" here means that we interpolate using a 3rd-order polynomial.



$$
y = a_0 x^3 + a_1 x^2 + a_2 x + a_3
$$
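
A spline does not fit one such polynomial to all the data; it uses a separate cubic on each interval between neighbouring data points. One common way to write the piece on the $k$-th interval, with coefficient names chosen here only for illustration, is:

$$
S_k(x) = a_{k,0} (x - x_k)^3 + a_{k,1} (x - x_k)^2 + a_{k,2} (x - x_k) + a_{k,3}, \qquad x_k \le x \le x_{k+1}
$$

Neighbouring pieces are joined so that the value, the slope, and the curvature all match at each interior data point:

$$
S_k(x_{k+1}) = S_{k+1}(x_{k+1}), \qquad S_k'(x_{k+1}) = S_{k+1}'(x_{k+1}), \qquad S_k''(x_{k+1}) = S_{k+1}''(x_{k+1})
$$

The conditions imposed at the two end points vary between implementations.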



### SciPy



The following cell first instantiates a cubic interpolator `cubic_interp` and then uses it.



In [None]:
# https://www.scipy-lectures.org/intro/scipy.html#interpolation-scipy-interpolate

# Import interpolation subpackage
import scipy.interpolate as sn

cubic_interp = sn.interp1d(x_array, y_measurement, kind='cubic')
y_cubic = cubic_interp(x_i)
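
SciPy also offers a dedicated `CubicSpline` class. The following is a minimal sketch of it as an alternative, assuming a reasonably recent SciPy; the data values are made up for illustration and are not the notebook's random measurements.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical measurements at x = 0, 1, ..., 5
x_known = np.arange(6.0)
y_known = np.array([1.4, 2.1, 2.3, 3.2, 3.4, 4.1])

# Build the spline and evaluate it on a finer grid
spline = CubicSpline(x_known, y_known)
x_fine = np.linspace(x_known[0], x_known[-1], 51)
y_fine = spline(x_fine)

# An interpolating spline passes through every data point
print(np.allclose(spline(x_known), y_known))

# The second argument selects a derivative order; here the slope at each data point
slope_at_points = spline(x_known, 1)
print(slope_at_points.shape)
```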



In [None]:
# Plot data points
ax = df.plot.scatter(x='x', y="y_measurement", label="y_measurement")

# Plot linear interpolation
ax.plot(x_i, y_i, '.', label='$y_{linear}$')

# Plot cubic spline curve
ax.plot(x_i, y_cubic, 'x', label='$y_{cubic}$')

# Show legend table
py.legend(loc=0)

py.show()



### `pandas`



In [None]:
# http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.interpolate.html
df_interp = df_interp_nan.interpolate(method='cubic')



In [None]:
# Plot interpolation
ax = df_interp.plot.scatter(x='x', y="y_measurement", label='$y_{interp}$', c='orange')
df.plot.scatter(x='x', y="y_measurement", ax=ax, label="y_measurement")

# Show legend table
py.legend(loc=0)

py.show()



Output formatting




In [None]:
pd.set_option('display.float_format', '{:.2g}'.format)
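
The option takes any callable that maps a float to a string. As a quick check of what the `'{:.2g}'` format does (two significant digits, switching to scientific notation for large magnitudes), consider a few made-up values:

```python
# The same formatter that is passed to pandas above
fmt = '{:.2g}'.format

print(fmt(1234.5678))  # 1.2e+03
print(fmt(0.123456))   # 0.12
print(fmt(2.0))        # 2
```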



Maximum number of output rows



In [None]:
pd.options.display.max_rows = 700



## Exercises



Try this 1: Make a table of $\sin \theta$ for $\theta$ from $0^\circ$ to $360^\circ$ at intervals of $10^\circ$. Also plot it.



Try this 2: Using the values in the table above, estimate $\sin \theta$ at intervals of $1^\circ$. Compare with the result of `py.sin()` on a plot.



## References



* Wes McKinney, Python for Data Analysis, 2nd Ed., O'Reilly, 2017. ([Code and data](https://github.com/wesm/pydata-book/)) Korean edition: translated by 김영근, 한빛미디어, 2019, ISBN 979-11-6224-190-5.



## Final Bell



In [None]:
# Ring the terminal bell: https://stackoverflow.com/a/24634221
import os
os.system("printf '\a'");

