---
title: "Session 01: Introduction"
---

## Summary

We did not cover much statistics or quantitative methods in this session.

Instead, we imported datasets into SPSS and played around with the two below:

- `mtcars`: a dataset built into R, wrapped in an `.xlsx` file for some reason.
- `temphr10.sav`: a dataset from the textbook; judging by its columns (`sex`, `hr`, two temperature columns, a Likert rating), it looks like a small set of per-person measurements.

## Implementations
Since I use Python for my daily work, I loaded the data into pandas instead.

In [1]:
# For descriptive statistics, pandas alone is enough for the job.
import pandas as pd
import seaborn as sns

In [2]:
# Load the SPSS dataset "temphr10.sav" (pd.read_spss needs the pyreadstat package).
temphr10 = pd.read_spss('./datasets/temphr10.sav')
# I converted the Excel file to CSV beforehand.
mtcars = pd.read_csv('./datasets/mtcars.csv')


### In SPSS: Measures of variables


In [3]:
# The column names, i.e. what SPSS lists in the Variable View
temphr10.columns

Index(['sex', 'hr', 'temp_Fahrenheit', 'temp_Celcius', 'Likert_rating'], dtype='object')

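In SPSS, the Measure column of the Variable View only offers Nominal, Ordinal, and Scale; Scale covers both interval and ratio variables. The four classic levels are:
- Nominal: labels with no order, e.g. *race, gender / sex*
- Ordinal: ranked categories, e.g. *a Likert rating*
- Interval: ordinal + equally spaced, e.g. *temperature in °C or °F, IQ*
- Ratio: interval + a true zero point, e.g. *age, wage, height, weight (it can be 0)*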
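
The nominal / ordinal distinction can also be expressed in pandas through categorical dtypes. The following is a minimal sketch, not what SPSS does internally: the three sample rows are hypothetical (only the labels are taken from this notebook), and the order of the Likert labels is my assumption.

```python
import pandas as pd

# Hypothetical sample rows; only the labels come from the notebook outputs.
df = pd.DataFrame({
    "sex": ["Male", "Female", "Male"],
    "Likert_rating": ["Agree", "Strongly Disagree", "Neutral or Don't Know"],
})

# Assumed scale order, from most negative to most positive.
likert_order = [
    "Strongly Disagree",
    "Disagree",
    "Neutral or Don't Know",
    "Agree",
    "Strongly Agree",
]

# Nominal: categories without any order.
df["sex"] = pd.Categorical(df["sex"], ordered=False)
# Ordinal: categories with an explicit order.
df["Likert_rating"] = pd.Categorical(
    df["Likert_rating"], categories=likert_order, ordered=True
)

# Ordered categories support < and >, and therefore min / max.
lowest = df["Likert_rating"].min()
above_neutral = (df["Likert_rating"] > "Neutral or Don't Know").tolist()

# Unordered categories only support == and !=; an inequality raises TypeError.
try:
    df["sex"] < "Male"
    nominal_is_comparable = True
except TypeError:
    nominal_is_comparable = False

print(lowest, above_neutral, nominal_is_comparable)
```

Declaring the order once on the dtype means later sorting and comparisons follow the scale rather than the alphabet.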

| Level of Measurement | Valid Operations                                  |
|----------------------|---------------------------------------------------|
| Nominal              | $=$, $\neq$                                       |
| Ordinal              | $=$, $\neq$, $<$, $>$                             |
| Interval             | $=$, $\neq$, $<$, $>$, $+$, $-$                   |
| Ratio                | $=$, $\neq$, $<$, $>$, $+$, $-$, $\times$, $\div$ |
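
To see why $\times$ and $\div$ are reserved for ratio scales, it may help to compare the same two temperatures in different units. This is only a sketch: the two readings are hypothetical values picked to keep the arithmetic simple, and the helper function names are mine.

```python
def celsius_to_fahrenheit(c: float) -> float:
    return c * 9 / 5 + 32


def celsius_to_kelvin(c: float) -> float:
    return c + 273.15


# Hypothetical readings, chosen only to make the arithmetic easy to follow.
cold_c, warm_c = 10.0, 20.0

# Differences are meaningful on an interval scale:
# the gap is the same in both units, up to the constant factor 9/5.
diff_c = warm_c - cold_c
diff_f = celsius_to_fahrenheit(warm_c) - celsius_to_fahrenheit(cold_c)

# Ratios are not: "twice as warm" in Celsius is not twice as warm in Fahrenheit.
ratio_c = warm_c / cold_c
ratio_f = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cold_c)

# Kelvin has a true zero, so this ratio does not depend on an arbitrary offset.
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(diff_c, diff_f, ratio_c, ratio_f, ratio_k)
```

Here 20 °C / 10 °C gives 2, while the same readings in Fahrenheit (68 / 50) give 1.36, so the ratio depends on the unit and carries no physical meaning.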

## Descriptive Statistics

In [5]:
# Ways to get descriptive statistics in pandas.
temphr10['sex'].describe()

count       10
unique       2
top       Male
freq         7
Name: sex, dtype: object

In [6]:
temphr10['Likert_rating'].value_counts()

Likert_rating
Neutral or Don't Know    3
Agree                    2
Disagree                 2
Strongly Disagree        2
Strongly Agree           1
Name: count, dtype: int64
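
`value_counts()` sorts by frequency, which scrambles the order of an ordinal scale. A frequency table closer to what SPSS's Frequencies procedure prints (with percent and cumulative percent) could be sketched as below. The counts are copied from the output above; the scale order is my assumption.

```python
import pandas as pd

# Counts copied from the value_counts() output above.
counts = pd.Series(
    {
        "Neutral or Don't Know": 3,
        "Agree": 2,
        "Disagree": 2,
        "Strongly Disagree": 2,
        "Strongly Agree": 1,
    },
    name="count",
)

# Assumed scale order, from most negative to most positive.
likert_order = [
    "Strongly Disagree",
    "Disagree",
    "Neutral or Don't Know",
    "Agree",
    "Strongly Agree",
]

# Put the rows in scale order, then add percent and cumulative percent.
freq = counts.reindex(likert_order).to_frame()
freq["percent"] = 100 * freq["count"] / freq["count"].sum()
freq["cumulative_percent"] = freq["percent"].cumsum()

print(freq)
```

Cumulative percent only makes sense once the rows are in scale order, which is why the reindex comes first.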

In [7]:
temphr10['hr'].value_counts()

hr
75.0    2
70.0    1
69.0    1
71.0    1
62.0    1
74.0    1
80.0    1
73.0    1
82.0    1
Name: count, dtype: int64
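
For a numeric column such as `hr`, a frequency count is less informative than summary statistics. A minimal sketch, with the ten values rebuilt from the counts above (the original row order is unknown, which does not affect these statistics):

```python
import pandas as pd

# Values rebuilt from the value_counts() output above; row order is unknown.
hr = pd.Series(
    [75, 75, 70, 69, 71, 62, 74, 80, 73, 82], dtype="float64", name="hr"
)

summary = hr.describe()          # count, mean, std, min, quartiles, max
median = hr.median()
modes = hr.mode().tolist()       # a list, since there can be more than one mode
value_range = hr.max() - hr.min()

print(summary)
print(median, modes, value_range)
```

On these values the mean is 73.1, the median 73.5, and the only mode is 75, the one value that appears twice.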