# Introduction

### What is Python? 

- Python is a popular general-purpose programming language that is designed to handle a wide range of problems. 

- Recent developments have extended Python's range of applicability to econometrics, statistics and general numerical analysis. 

### Should we use Python?

- Python (with the right set of add-ons) is comparable to domain-specific languages such as R, MATLAB or Julia. If you are wondering whether you should bother with Python (or another language), consider the following:

    - You might want to consider R if:
        - 1. You want to apply statistical methods (the statistics library of R is second to none).
        - 2. Performance is of secondary importance.
        - 3. Being free of charge is important. 
        
    - You might want to consider MATLAB if:
        - 1. Commercial support and a clear channel to report issues are important.
        - 2. Documentation and organization of modules are more important than the breadth of algorithms available.
        - 3. Performance is an important concern.
    - You might want to consider Julia if:
        - 1. Performance in an interactive language is your most important concern.
        - 2. You like living on the bleeding edge and aren't worried about code breaking across new versions of Julia.
        - 3. You like to do most things yourself.
        
    - Having read the reasons to choose another package, you may wonder why you should consider Python. Consider it if:
        - 1. You need a language that can act as an end-to-end solution, giving access to web-based services, database servers, data management and processing, and statistical computation. 
        - 2. Data handling and manipulation, especially cleaning and reformatting, is an important concern. Python is substantially more capable at data set construction than either R or MATLAB.
        - 3. Performance is a concern, but not at the top of the list.
        - 4. Free is an important consideration.
        - 5. Knowledge of Python, as a general purpose language, is complementary to R/MATLAB/Julia/GAUSS/Stata.

<img src="./img/3.png" width="600" height="600">

### Important Components of the Python Scientific Stack

- 1. Python
    - This provides the core Python interpreter. At the time of writing the latest version is Python 3.7, but I personally prefer Python 3.6 because of dependency issues. 

- 2. NumPy 
    - NumPy provides a set of array and matrix data types which are essential for statistics, econometrics and data analysis. 
- 3. SciPy
    - SciPy contains a large number of routines needed for analysis of data. The most important include a wide range of random number generators, linear algebra routines, and optimizers.
- 4. Jupyter and IPython
    - IPython provides an interactive Python environment which enhances productivity when developing code or performing interactive data analysis. Jupyter provides a generic set of infrastructure that enables IPython to be run in a variety of settings, including an improved console (QtConsole) or an interactive web-browser-based notebook.
- 5. Matplotlib and seaborn 
    - matplotlib provides a plotting environment for 2D plots, with limited support for 3D plotting. seaborn is a Python package that improves the default appearance of matplotlib plots without any additional code.
- 6. Pandas 
    - pandas provides high-performance, labeled data structures (`Series` and `DataFrame`) and tools for data manipulation.
- 7. Statsmodels
    - statsmodels is pandas-aware and provides models used in the statistical analysis of data, including linear regression, Generalized Linear Models (GLMs), and time-series models (e.g., ARIMA).
- 8. scikit-learn (sklearn)
    - scikit-learn provides machine learning models for regression, classification, etc. 
- 9. quantecon
    - The quantecon Python library consists of a number of modules, including game theory (game_theory), Markov chains (markov), random generation utilities (random), a collection of tools (tools), and other utilities (util).

### Python Installation

- [Anaconda Distribution](https://www.anaconda.com/download/#windows)

<img src="./img/1.png" width="600" height="600">

- [Anaconda Installer Archive](https://repo.continuum.io/archive/#windows)

### Add not-yet installed packages

-  **If an import fails because a module is not installed, install it with the package manager [pip](https://pypi.org/project/pip/) or [conda](https://conda.io/docs/index.html)** <br>
      >conda install $PACKAGE_NAME <br>
            >>`conda install pip3`  <br>
            >>`OR` 
            >>`pip install pip3`  <br>
            
      >conda update $PACKAGE_NAME      
      
      > conda install -y ipython matplotlib pandas pytables scipy seaborn <br>
      > conda remove pytables 
      
      >conda list
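
Before installing anything, it can help to check which packages are already importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the package list are illustrative assumptions, not part of conda or pip.

```python
import importlib.util


def missing_packages(names):
    """Return the names in `names` that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list of top-level packages to check.
wanted = ["numpy", "pandas", "scipy", "matplotlib", "seaborn"]

for name in missing_packages(wanted):
    # Only a suggestion is printed; nothing is installed automatically.
    print(f"{name} is missing: try `conda install {name}` or `pip install {name}`")
```

The check only looks at top-level module names, which sometimes differ from the name used to install a package.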
      
      
### Sometimes we need a virtual environment

- Some code requires Python 2.x (almost at end of life).
- Some code requires Python 3.5.x (the current version is 3.7.x).  <br>
- Some code requires a 32-bit interpreter.

>set CONDA_FORCE_32BIT=1  __<font color=blue>[only when 32-bit is required]</font>__<br>    
>conda create -n py32_35 python=3.5

> activate py32_35 (or use the mouse with the Windows command prompt) <br>
> deactivate py32_35

<img src="./img/2.png" width="200" height="200">


### Jupyter? IDE?

- Jupyter Notebook :
    - Popular interactive development environment.
    - Used for Python, R and Julia.
    
- IDE (Integrated Development Environment)
    - A software application that provides comprehensive facilities to computer programmers for software development. 
    - PyCharm, Sublime Text, VS Code, ...
    
- Cloud based
    - [Google Colaboratory](https://colab.research.google.com/notebooks/intro.ipynb#scrollTo=lSrWNr3MuFUS)

In [22]:
import pandas as pd
import numpy as np
import sklearn
import statsmodels

print(pd.__version__)
print(np.__version__)
print(sklearn.__version__)
print(statsmodels.__version__)

1.3.5
1.20.3
1.0.2
0.12.2


### (summary)

# Python has two basic variable types

- (1) Numeric variables
    - int
    - float

- (2) String variables

## Numbers

- The integer numbers (e.g. 2, 4) have type int, the ones with a fractional part (e.g. 5.0) have type float.

- Division (/) always returns a float.

- You can check the type of a value with the built-in function `type()`.

In [1]:
print(type(4))
print(type(20))
print(type(5.0))

<class 'int'>
<class 'int'>
<class 'float'>


### Python as a simple calculator

- The interpreter acts as a simple calculator: expression in, value out.
- +, -, *, /

##### Example
- $2+2=4$
- $50-5 \times 6 =20$
- ${{50-5 \times 6} \over {4}}=5 $

In [2]:
print(2+2)
print(50-5*6)
print((50-5*6)/4)

4
20
5.0
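
The last result above is `5.0` rather than `5` because `/` always returns a float. A small sketch of that rule, with illustrative variable names:

```python
whole = 4 / 2      # true division: a float, even though 4 is divisible by 2
mixed = 2 + 3.0    # mixing int and float also gives a float

print(whole, type(whole))
print(mixed, type(mixed))
print(isinstance(whole, float))
```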


- To calculate powers, we need to use the ** operator.
- Floor division (quotient): `//`; remainder: `%`

In [3]:
print(2**10)
print(13//5)
print(13%5)

1024
2
3
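
The quotient and remainder are linked: multiplying the quotient by the divisor and adding the remainder recovers the original number. A short sketch using the built-in `divmod`, which returns both values at once (the variable names are illustrative):

```python
a, b = 13, 5

quotient, remainder = divmod(a, b)   # same pair as (a // b, a % b)

print(quotient, remainder)
print(quotient * b + remainder == a)
```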


- The equal sign (=) is used to assign a value to a variable. 

In [4]:
price = 1000
quantity = 5

price*quantity

5000

In [5]:
word = 'Economics'
word1 = "'Economics'"
print(word)
print(word1)

Economics
'Economics'


In [6]:
filename = '005930.txt'
print(filename.split('.'))
print(filename.split('.')[0])

['005930', 'txt']
005930


# Sequences

- Sequences are data types that hold an ordered collection of items. 

### Built-in Sequences 

- Container sequences 
    - list, tuple, collections.deque : can hold items of different types.
- Flat sequences
    - str, bytes, bytearray, etc. : hold items of one type.
- Sequences are also either mutable (list, bytearray, collections.deque) or immutable (tuple, str, bytes).

### Tuple

- Immutable: items cannot be changed after creation.
- Written with parentheses: `(1, 2, 3)`
- Items are accessed by index with `[]`.
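
A minimal sketch of these three points; the tuple `point` and its contents are made up for illustration:

```python
point = (3, 5, 'x')      # a tuple can mix item types

print(point[0])          # first item
print(point[-1])         # last item
print(len(point))

try:
    point[0] = 10        # tuples are immutable, so item assignment fails
except TypeError as err:
    print('TypeError:', err)
```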

### List 

- Mutable: items can be changed, added or removed.
- Written with square brackets: `[1, 2, 3]`
- Common list operations:

|operation | argument | result|
| :---:   |    :---:      |   :---:  |
|ls[i] = j| index i, value j | the element at index i is replaced by j |
| ls.append| (value) | the value is added to the end |
| ls.extend| ([values]) | each value in the given list is added to the end |
|ls.count|(value)| number of times the value occurs |
|del ls|[index]| deletes the element at the index |
|ls.remove|(value)| removes the first occurrence of the value |
|ls.pop|(index)| removes the element at the index and returns it |
|ls.reverse|()| reverses the order in place |
|ls.sort|()| sorts in place, in ascending order |

In [7]:
a = [1,2,3]
b = list([1,2,3])
c = list((1,2,3))
print(a == b)
print(a == c)

True
True
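
The list operations in the table can be tried one after another. This is a small sketch with made-up values; the comments show the list after each step.

```python
ls = [3, 1, 2]

ls[0] = 5            # replace the item at index 0      -> [5, 1, 2]
ls.append(1)         # add one value at the end         -> [5, 1, 2, 1]
ls.extend([7, 8])    # add every item of another list   -> [5, 1, 2, 1, 7, 8]
print(ls.count(1))   # how many times 1 occurs
del ls[0]            # delete the item at index 0       -> [1, 2, 1, 7, 8]
ls.remove(1)         # remove the first 1 encountered   -> [2, 1, 7, 8]
last = ls.pop(-1)    # remove and return the last item  -> [2, 1, 7]
ls.reverse()         # reverse in place                 -> [7, 1, 2]
ls.sort()            # sort in place, ascending         -> [1, 2, 7]

print(last)
print(ls)
```

Note that `append`, `extend`, `remove`, `reverse` and `sort` change the list in place and return `None`, so their results should not be assigned back to `ls`.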


### Dictionaries

- Another useful data type built into Python is the dictionary. 
- Unlike sequences, which are indexed by position, dictionaries are indexed by keys, which can be of any immutable type, such as strings and numbers.
- It is best to think of a dictionary as a set of key: value pairs, with the requirement that the keys are unique (within one dictionary).

In [8]:
score = {'Park':100, 'Kim':60, 'Lee':40}
print(score)
print(score['Park'])
print(score.keys())
print(score.values())
print(score.items())

{'Park': 100, 'Kim': 60, 'Lee': 40}
100
dict_keys(['Park', 'Kim', 'Lee'])
dict_values([100, 60, 40])
dict_items([('Park', 100), ('Kim', 60), ('Lee', 40)])


In [9]:
score['Chung'] = 0
print(score)

{'Park': 100, 'Kim': 60, 'Lee': 40, 'Chung': 0}
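
The uniqueness and immutability requirements on keys can be seen in a short sketch. The names `Choi` and `grid` are hypothetical and only serve the illustration.

```python
score = {'Park': 100, 'Kim': 60}

score['Park'] = 90            # assigning to an existing key overwrites its value
print(score)
print(len(score))             # still two entries, because keys are unique

print(score.get('Choi'))      # .get returns None for a missing key instead of raising
print('Kim' in score)         # membership tests look at the keys

grid = {(0, 0): 'origin'}     # a tuple is immutable, so it can be a key
try:
    grid[[1, 1]] = 'point'    # a list is mutable, so it cannot be a key
except TypeError as err:
    print('TypeError:', err)
```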


# Boolean Values

- In programming you often need to know if an expression is True or False.

- Any comparison in Python evaluates to one of two answers, True or False.

- When you compare two values, the expression is evaluated and Python returns the Boolean answer:

In [10]:
print(type(True))
print(type(False))

<class 'bool'>
<class 'bool'>


In [11]:
print(10 > 9)
print(10 >= 9)
print(10 == 9)
print(10 != 9)
print(10 < 9)
print(10 <= 9)

True
True
False
True
False
False


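- `and`, `or` are the logical operators; `&`, `|` are bitwise operators that give the same results when both operands are Booleans.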

In [12]:
print((10 > 9) and (10 < 9))
print((10 > 9) & (10 < 9))
print((10 > 9) or (10 < 9))
print((10 > 9) | (10 < 9))

False
False
True
True
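
The parentheses in the cell above matter. A short sketch of two differences between `and` and `&`: operator precedence and short-circuit evaluation (the variable `x` is illustrative).

```python
# Without parentheses, & binds more tightly than the comparisons:
# 10 > 9 & 10 < 9 is read as 10 > (9 & 10) < 9, i.e. 10 > 8 < 9.
print(10 > 9 and 10 < 9)
print(10 > 9 & 10 < 9)

# `and` short-circuits: the right side is skipped when the left side is False.
x = 0
print(x != 0 and 10 / x > 1)   # no ZeroDivisionError, 10 / x is never evaluated
```

For plain True/False logic, `and` and `or` are usually the safer choice.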


# Control flow statements

- Control flow (or flow of control) is the order in which the individual statements, instructions or function calls of an imperative program are executed or evaluated.

- Python's control flow statements include `if`, `while` and `for`.

### If
- Executes the indented block if the condition evaluates to True.
- Further conditions can be added with `elif`, and a fallback with `else`.

In [13]:
a, b, c, d = 3, 5, 5, 5

if a+b==10 :
    print('ok')

In [14]:
if c+d==10 :
    print('ok')

ok


In [15]:
c, d = 3, 5

if c + d == 10:
    print('ok')
elif c + d == 9:
    print('Umm....')
else:
    print('no')

no


### Loop(For, While)

- Python’s for statement iterates over the items of any sequence (a list or a string), in the order that they appear in the sequence.
- The `enumerate` function gives access to both the index and the value of each item.

In [16]:
for a in [105,62,23,41] :
    print(a+50)

155
112
73
91
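
A minimal sketch of `enumerate` on the same list; the names `values` and `pairs` are illustrative, and `start=1` is an optional argument that changes the first index.

```python
values = [105, 62, 23, 41]

for i, v in enumerate(values):
    # i is the index (starting at 0), v is the item
    print(i, v + 50)

pairs = list(enumerate(values, start=1))   # count from 1 instead of 0
print(pairs)
```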


In [17]:
# make a list using loops

a = []

for x in range(10) : # Unfamiliar function? Place the cursor inside the parentheses and press Shift+Tab
    a.append(x**2)
    
a

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
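
The heading of this section also mentions `while`, which repeats a block as long as its condition stays true. Here is a minimal sketch that builds the same list of squares with a `while` loop and compares it with a list comprehension, a compact alternative; the variable names are illustrative.

```python
squares = []
x = 0
while x < 10:              # the loop ends once x reaches 10
    squares.append(x ** 2)
    x += 1                 # without this update the loop would never end

print(squares)
print(squares == [n ** 2 for n in range(10)])
```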