# [Computational Social Science] 
## 1-1 Anaconda Installation Instructions!

This is a guide to help you properly install and set up Anaconda for the course -- you will want to have this done before you come to the next class so that you will have access to the first lab of the class! 

---

### I. Installation Steps:

1. Go to https://www.anaconda.com/download/

2. Choose the appropriate operating system and download the **Python 3.8 version**

3. Follow the installation instructions
<div class="alert alert-block alert-info">
<b>IMPORTANT:</b> If you are a **Windows** user, choose to *also* install **Anaconda Prompt** -- this will be your terminal from which you will activate virtual environments and access your labs.
</div>


4. To verify that Anaconda has been installed, open up your terminal and \*\*run the following command: `conda --version`

(\*\*type the exact command as written and hit enter)

If your installation was successful, the output should read `conda` followed by a version number. If you get an error at this point, please reach out to the GSI.
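If you would rather run the check from Python than from the terminal, the same verification can be sketched as follows. This is a minimal sketch that assumes `conda` is on your `PATH`; the helper name `conda_version` is made up for illustration and is not part of Anaconda.

```python
import shutil
import subprocess


def conda_version():
    """Return the text printed by `conda --version`, or None if conda is not found."""
    # Hypothetical helper: look for the conda executable on the PATH.
    conda_path = shutil.which("conda")
    if conda_path is None:
        return None
    result = subprocess.run(
        [conda_path, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some tools print their version to stderr, so check both streams.
    return (result.stdout or result.stderr).strip() or None


version = conda_version()
print(version if version else "conda was not found on your PATH")
```

If this prints the "not found" message even though the installer finished, try again from Anaconda Prompt (Windows) or a freshly opened terminal.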


---
### II. Set-Up: Creating a Virtual Environment

We create virtual environments so that we can use different versions of applications and libraries side by side. Each virtual environment is an isolated Python environment with its own set of installed libraries. Here, we will make a new virtual environment called `legal-studies` and show you how to activate and deactivate it.

**Steps:**
1. Open your Terminal (for Windows, open Anaconda Prompt)
2. In your terminal, run the command:
    `conda create -n legal-studies python=3 anaconda`
    
    You've now created your virtual environment!
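3. To activate this virtual environment, run:
    - on Mac or Linux: `source activate legal-studies`
    - on Windows: `activate legal-studies`
    - if these are not recognized (conda 4.4 or later, any operating system): `conda activate legal-studies`
4. To deactivate the virtual environment, run:
    - Mac or Linux: `source deactivate`
    - Windows: `deactivate`
    - if these are not recognized (conda 4.4 or later, any operating system): `conda deactivate`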
    
    
Remember to always activate your virtual environment before you install packages or run a notebook! This reduces the risk of breaking your root Python/Anaconda installation.
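To confirm which environment a Python session is actually running in, a small check like the one below can help. It is a sketch, not an official conda tool: it relies on the `CONDA_DEFAULT_ENV` environment variable, which conda sets on activation, and the function name `active_environment` is hypothetical.

```python
import os
import sys


def active_environment():
    """Return (environment name, interpreter prefix) for the running Python."""
    # CONDA_DEFAULT_ENV is set by conda when an environment is activated;
    # it may be missing if Python was started outside of conda.
    name = os.environ.get("CONDA_DEFAULT_ENV", "<no conda environment detected>")
    # sys.prefix is the directory of the Python installation in use.
    return name, sys.prefix


env_name, prefix = active_environment()
print("Active environment:", env_name)
print("Python prefix:", prefix)
```

After activating `legal-studies`, the first line should show that name, and the prefix should point inside that environment's folder.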

### III. Navigating Your Directories

At this point, you can run the command to start your Jupyter Notebook server. However, it will open in your home directory and you will have to click through your folders to find the file you want to open. To prevent this, you can **navigate to the desired directory first** in the terminal, and open the server in that directory.

A "directory" is just another term for "folder" -- your Desktop folder is a directory, as are your Downloads, Documents, and OneDrive folders. All you are doing here is laying out the path you will take from your home directory to whichever folder you want to work from.

Here are some basic commands in the terminal:

- `cd <path to directory>`: you can navigate through your directories from your root with the `cd` command by specifying a path to your desired directory.
    - e.g. If your home directory contains your 'Desktop' folder, `cd Desktop` takes you from Home to your Desktop directory.
    - e.g. If your 'Desktop' folder contains folder 'GoBears' which contains the folder 'Oski', the following command from your Home Directory takes you to the Oski folder: `cd Desktop/GoBears/Oski`
- `cd ..`: this allows you to go back to the previous directory (called the parent directory)
    - e.g. 'GoBears' is the parent folder of 'Oski', so from the 'Oski' folder, `cd ..` will take you to the 'GoBears' folder.
- `ls` (Mac or Linux) or `dir` (Windows): lists all folders/files in the current directory. This is a good way to check, for example, whether the current directory contains your Desktop folder.

Now you know how to navigate directories from your terminal! Find your desired directory **before** you start the Jupyter Notebook server so that you do not have to click through layers of folders.
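The same ideas (a path, a parent directory, a directory listing) can also be explored from Python with `pathlib`. The sketch below builds a throwaway copy of the `Desktop/GoBears/Oski` example inside a temporary folder, which stands in for your home directory so that none of your real files are touched.

```python
import tempfile
from pathlib import Path

# Build a throwaway folder tree that mirrors the example: Desktop/GoBears/Oski.
# The temporary directory stands in for your home directory (an assumption
# made so the example does not touch your real files).
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    oski = home / "Desktop" / "GoBears" / "Oski"
    oski.mkdir(parents=True)

    # Equivalent of `cd Desktop/GoBears/Oski` followed by `cd ..`
    parent_name = oski.parent.name

    # Equivalent of `ls` (or `dir`) run inside the Desktop folder
    desktop_contents = sorted(p.name for p in (home / "Desktop").iterdir())

print(parent_name)        # GoBears
print(desktop_contents)   # ['GoBears']
```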

---
### IV. Run Your First Notebook!

Anaconda comes with Jupyter Notebook, which is what we will use throughout this course for all of our labs. In order to run your first notebook:

1. Open your terminal (for Windows users, use Anaconda Prompt)

2. Activate your virtual environment

3. Navigate to your desired directory

4. Run the following command on your terminal: `jupyter notebook`

Your default browser window will open, and you should be in your specified directory. From here, you can create a new notebook, open and edit saved notebooks, and much, much more!

To close the notebook server (and shut down all running notebooks), run the command: `jupyter notebook stop` OR simply hit `Ctrl + c` in your terminal.

---

Borrowed from [Legal Studies 123: Data, Prediction, and Law](https://github.com/Akesari12/LS123_Data_Prediction_Law_Spring-2019/tree/master/labs/Anaconda%20Installation%20Guide)


In [None]:
# Import the required packages
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LinearRegression
import networkx as nx
import seaborn as sns
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error, mean_squared_error

## 1 Analysis and processing of time series data
There are two main types of factors affecting the flow of people in urban areas, namely temporal information and spatial information.

Before we can forecast spatial urban flows, we need to extract the characteristics of the time series. First, we analyse the stationarity of the time series with the help of trend analysis, then apply appropriate detrending methods to make the series stationary, which makes forecasting more feasible. Finally, we apply autocorrelation analysis to determine important parameters of the forecasting model. Processing and analysing the time series in this way supports the feasibility of our forecasts and can also help improve their accuracy.

👇 The code below reads in the time series data. As an example, we use a dataset of network features for Las Vegas (US) from 1 January 2019 to 16 April 2021 (837 days) to learn how to analyse and process time series data.

In [None]:
# Read time series data and index it using dates
ts = pd.read_csv(
    '/home/mw/input/ts_example3592/LasVegas_File_Out.csv',  # The data path is copied from the corresponding file in the input directory of the file tree on the left
    index_col='dataTime',
    parse_dates=['dataTime'])
ts.head()  # Preview the first few rows
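If the data file is not available, the effect of `index_col` and `parse_dates` can still be tried on a synthetic stand-in. In the sketch below, only the column name `dataTime` and the date range come from the text above; the `value` column and its random contents are made up purely for illustration.

```python
import io

import numpy as np
import pandas as pd

# Synthetic stand-in for the real file: one row per day over the study period.
# The column name `value` is hypothetical; only `dataTime` comes from the text.
dates = pd.date_range("2019-01-01", "2021-04-16", freq="D")
rng = np.random.default_rng(0)
fake = pd.DataFrame({"dataTime": dates, "value": rng.normal(size=len(dates))})

# Round-trip through CSV text so read_csv is used with the same arguments.
csv_text = fake.to_csv(index=False)
ts_demo = pd.read_csv(
    io.StringIO(csv_text),
    index_col="dataTime",
    parse_dates=["dataTime"],
)

print(type(ts_demo.index).__name__)  # DatetimeIndex
print(len(ts_demo))                  # 837
```

The row count confirms the 837 days stated above, and the `DatetimeIndex` is what makes date-based slicing and rolling windows convenient later on.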

#### 1.1.1 Using moving averages to determine trends in the movement of time series  
  
The principle of the moving average is to **eliminate or attenuate short-term fluctuations** in a time series, so that the data reveal long-term trends or cycles.  
  
First we need to define a sliding window of width $N$ and then take the average of each window in the time series to form a new series.  
  
The exact formula is as follows:
$$  
M_t=\frac{y_t+y_{t-1}+\ldots+y_{t-N+1}}{N}     
$$  
  
- $M_t$ represents the observation at time point $t$ of the new time series
- $y_t$ represents the observation at time point $t$ of the original time series
- $N$ represents the number of terms averaged in the sliding window, i.e. its width
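As a small worked check of the definitions above, the sketch below applies a trailing window (ending at $y_t$ and reaching back to $y_{t-N+1}$) to a hand-picked toy series. The numbers are invented for illustration and are not taken from the Las Vegas data.

```python
import pandas as pd

# Toy series, chosen by hand so the averages are easy to verify.
y = pd.Series([2.0, 4.0, 6.0, 8.0, 10.0])
N = 3  # window width

# rolling(N).mean() averages the current value and the N-1 values before it,
# i.e. M_t = (y_t + y_{t-1} + ... + y_{t-N+1}) / N.
M = y.rolling(window=N).mean()

# The same quantity written out directly from the formula, for comparison.
M_manual = [
    sum(y.iloc[t - N + 1 : t + 1]) / N if t >= N - 1 else float("nan")
    for t in range(len(y))
]

print(M.tolist())  # [nan, nan, 4.0, 6.0, 8.0]
```

The first $N-1$ entries are undefined because a full window is not yet available; from the third point on, each value is the mean of the current observation and the two before it.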

  
👇 The code below implements the moving average method. The moving average plot shows that the time series has a cyclical pattern: for example, a moving average over monthly windows reveals a sine-like cycle in city activity. It does not, however, reveal a clear long-term increasing or decreasing trend in the city.