# Modeling Sequential Data - NN

- From [23]

## IID - Independent and Identically Distributed

- If the $n$ data samples $x^{(1)},x^{(2)},\ldots,x^{(n)}$ are IID, the order in which we use the data for training our machine learning algorithms does not matter
  - Example:
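    - The order in which the rows of an AND or OR gate truth table are fed to a Perceptron doesn't matter.
      - AND gate: $[[0,0],[0,1],[1,0],[1,1]] \rightarrow [0,0,0,1]$
      - AND gate, rows reordered: $[[1,1],[0,0],[0,1],[1,0]] \rightarrow [1,0,0,0]$
      - Any other ordering works equally well
    - The Perceptron learns parameters that fit the gate in either case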
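
A minimal sketch of the example above, assuming the classic perceptron update rule with a step activation (the notes do not name a specific rule). The helper names `train_perceptron` and `predict`, the learning rate, and the epoch count are illustrative choices, not taken from [23]. It trains on the AND gate in both row orders and checks the predictions on the full truth table.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Perceptron rule (assumed here): w += lr * (y - y_hat) * x."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(samples, labels):
            y_hat = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = y - y_hat
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b


def predict(w, b, x):
    """Step activation on the weighted sum."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


# The two orderings of the AND gate truth table from the notes.
order_a = ([[0, 0], [0, 1], [1, 0], [1, 1]], [0, 0, 0, 1])
order_b = ([[1, 1], [0, 0], [0, 1], [1, 0]], [1, 0, 0, 0])

truth_table = [[0, 0], [0, 1], [1, 0], [1, 1]]
w_a, b_a = train_perceptron(*order_a)
w_b, b_b = train_perceptron(*order_b)
preds_a = [predict(w_a, b_a, x) for x in truth_table]
preds_b = [predict(w_b, b_b, x) for x in truth_table]
print(preds_a, preds_b)  # [0, 0, 0, 1] [0, 0, 0, 1]
```

The intermediate weight updates differ between the two orderings, but both runs end up reproducing the AND truth table.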

## What is Sequential Data

- What sets sequential data apart from other data types is that the elements of a sequence appear in a certain order and are not independent of each other
- The $IID$ assumption is not valid in the case of _Sequential Data_
- _In Sequential Data_, __*order of data matters*__
  - Example:
    - __*Time Series Data*__
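
A small sketch of why order matters for a time series. The temperature values and the helper names `mean` and `mean_abs_step` are hypothetical, chosen only for illustration. An order-free statistic (the mean) is unchanged when the samples are rearranged, while an order-dependent one (the average change between consecutive samples) is not.

```python
def mean(values):
    return sum(values) / len(values)


def mean_abs_step(values):
    """Average absolute change between consecutive elements."""
    steps = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(steps) / len(steps)


# Hypothetical daily temperatures with a slow upward trend.
series = [10, 11, 12, 13, 14, 15, 16, 17]
# The same values in a different, fixed order (kept deterministic on purpose).
shuffled = [14, 10, 17, 12, 15, 11, 16, 13]

print(mean(series), mean(shuffled))                    # identical
print(mean_abs_step(series), mean_abs_step(shuffled))  # differ
```

Rearranging the samples keeps the set of values but destroys the trend, which is exactly the information a sequence model is meant to use.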

### Sequential Nature of Data

- Various kinds of datasets exhibit a sequential nature
  - Speech
    - Here the data arrives as a stream of spoken words, separated at word boundaries
      - To understand what the speech is about, we need the sequence of words spoken
    - Recognizing spoken conversation and extracting the context of the speech
      - Is it a question?
      - Is it a command to perform some task?
      - Is it a search request?
      - Example
        - Amazon Alexa
        - Google Assistant
        - Apple Siri
        - etc.
  - Documents
    - Here the data arrives as a sequence of words
      - To understand the context, the sequence of words needs to be processed
    - Understanding what a sentence means
      - Is it a question asking for various documents related to NLP?
  - Videos
    - Process the incoming video and audio, which form a sequence of data, and provide the necessary output
  - Weather Forecast
    - Here the historical weather reports form the sequence
    - Given historical weather reports, predict the weather (e.g., whether it is going to rain today or not)
  - Financial - Stock Market
    - Here several different variables related to the stock market are available as a sequence
    - Example
      - Given historical stock market data, predict the market status
      - Why is the market down today? Is there a similar situation in the historical information?

![Sequential_Nature_Of_Data](images/Sequential_Nature_Of_Data.jpg)

### Representing Sequences

- $x^{(1)},x^{(2)},\ldots,x^{(T)}$
- Example, time-series data
  - $x^{(t)}$ represents the data belonging to a particular time $t$
  - ![Sequence_Data_Representation](images/Sequence_Data_Representation.jpg)

## Limitations of MLP, CNN

- Standard Neural Network models such as the MLP (Multi-Layer Perceptron) and the CNN (Convolutional Neural Network) are __*not capable of handling the order of input samples*__
  - Intuitively, these models __*do not have a memory*__ of the past samples
  - Data are passed through the _Feed Forward_ and _Backpropagation_ steps, and the weights are updated _independently_ of the order in which the samples are passed
- From [24]
  - They __*accept fixed-size vectors*__ as input and __*produce fixed-size vectors*__ as output
    - E.g., Probabilities of different classes
  - These models perform this mapping using a fixed number of computational steps
    - i.e., the number of layers in the model
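
A minimal sketch of the "no memory" and "fixed-size" points, looking only at the forward pass. The weights `W`, the biases `B`, and the function `dense_forward` are hypothetical, hand-picked for illustration. A single dense layer maps a fixed-size input to a fixed-size output, and its result for a sample does not depend on what was fed in before.

```python
import math

# Hypothetical weights for a tiny layer with 2 inputs and 2 outputs.
W = [[0.5, -0.3],
     [0.8, 0.2]]
B = [0.1, -0.1]


def dense_forward(x):
    """Map a fixed-size input (2 values) to a fixed-size output (2 values)."""
    if len(x) != len(W[0]):
        raise ValueError("expected an input of size %d" % len(W[0]))
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W, B)]


first = dense_forward([1.0, 0.0])
dense_forward([0.0, 1.0])            # an unrelated sample in between
second = dense_forward([1.0, 0.0])   # the same sample again
print(first == second)  # True: the layer keeps no state between calls
```

Passing a longer or shorter input raises an error here, which mirrors the fixed-size restriction described in [24].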

## Capability of RNN

- RNNs (Recurrent Neural Networks) are __*designed for modeling sequences*__ and are capable of __*remembering past*__ information and __*processing new events*__ accordingly
- From [24]
  - RNNs allow us to operate over _sequences of vectors_
    - Sequence in the input
    - Sequence in the output
    - Or sequence in both input and output
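
A minimal sketch of how a recurrent cell "remembers", assuming the common formulation $h^{(t)} = \tanh(W_{xh} x^{(t)} + W_{hh} h^{(t-1)} + b_h)$, which these notes do not spell out. The scalar weights and the names `rnn_step` and `run_rnn` are hypothetical. The hidden state carries information from earlier elements, so the same values in a different order lead to a different final state.

```python
import math

# Hypothetical scalar weights for a one-unit recurrent cell.
W_XH = 1.0   # input -> hidden
W_HH = 0.5   # previous hidden -> hidden (the recurrent link)
B_H = 0.0


def rnn_step(x_t, h_prev):
    """One recurrence: h_t = tanh(W_XH * x_t + W_HH * h_prev + B_H)."""
    return math.tanh(W_XH * x_t + W_HH * h_prev + B_H)


def run_rnn(sequence, h0=0.0):
    """Feed a sequence one element at a time and return every hidden state."""
    states = []
    h = h0
    for x_t in sequence:
        h = rnn_step(x_t, h)
        states.append(h)
    return states


print(run_rnn([1.0, 0.0]))  # the final state still reflects the earlier 1.0
print(run_rnn([0.0, 1.0]))  # same values, different order, different final state
```

Because the loop runs once per element, the same cell handles sequences of any length, unlike the fixed-size mapping of an MLP.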

## Applications where Sequence Modeling is used

- Language Translation
- Image Captioning
  - Example: [Deep Visual-Semantic Alignments for Generating Image Descriptions](https://cs.stanford.edu/people/karpathy/deepimagesent/)
- Text Generation

## Categories of Sequence Modeling

- There are different categories of sequence modeling; identifying which one applies helps in developing an appropriate model
  - __Many-to-One__
    - The input data is a sequence, but the output is a fixed-size vector, not a sequence
      - E.g., _Sentiment Analysis_
        - The input is text-based and the output is a class label
  - __One-to-Many__
    - The input is in a standard data format, not a sequence, but the output is a sequence
      - E.g., _Image Captioning_
        - The input is an image and the output is an English phrase
  - __Many-to-Many__
    - Both input and output are sequences
    - This can be further categorized by whether the input and output are synchronized or not
      - __Synchronized__
        - E.g., _Video Classification_
          - Each frame in a video is labeled
      - __Delayed__
        - Translating one language into another
          - E.g., English to German

Image 1             |  Image 2
:-------------------------:|:-------------------------:
![Sequence_Model_Categories_1](images/Sequence_Model_Categories_1.jpg)   |  ![Sequence_Model_Categories_2](images/Sequence_Model_Categories_2.jpg)
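
The categories above differ mainly in which inputs are fed and which outputs are read. The sketch below illustrates only those input/output shapes, using a toy scalar recurrent cell with hypothetical, untrained weights; all function names are made up for this example and it is not how a real sentiment, captioning, or translation model is built.

```python
import math


def rnn_states(sequence, w_xh=1.0, w_hh=0.5):
    """Hidden state after each element of the sequence (toy scalar RNN)."""
    h, states = 0.0, []
    for x_t in sequence:
        h = math.tanh(w_xh * x_t + w_hh * h)
        states.append(h)
    return states


def many_to_one(sequence):
    """Sequence in, single output: read only the final hidden state."""
    return rnn_states(sequence)[-1]


def one_to_many(x, steps):
    """Single input, sequence out: feed x once, then let the state evolve."""
    return rnn_states([x] + [0.0] * (steps - 1))


def many_to_many_synced(sequence):
    """Sequence in, one output per input step."""
    return rnn_states(sequence)


def many_to_many_delayed(sequence, out_steps, w_hh=0.5):
    """Read the whole input first, then emit outputs from the final state."""
    h = many_to_one(sequence)
    outputs = []
    for _ in range(out_steps):
        h = math.tanh(w_hh * h)  # keep unrolling with no new input
        outputs.append(h)
    return outputs


inputs = [1.0, 0.0, 1.0]
print(many_to_one(inputs))                    # one value
print(len(one_to_many(1.0, 4)))               # 4 outputs from one input
print(len(many_to_many_synced(inputs)))       # 3 outputs for 3 inputs
print(len(many_to_many_delayed(inputs, 2)))   # 2 outputs after reading 3 inputs
```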

## Study Links

- [24]
- [Deep Visual-Semantic Alignments for Generating Image Descriptions](https://cs.stanford.edu/people/karpathy/deepimagesent/)
- [Finding Structure in Time](https://crl.ucsd.edu/~elman/Papers/fsit.pdf)
  - Explains how time is represented in NN
  - Uses the XOR problem as time-series data and details the RNN

# Finding Structure in Time

- Time underlies many interesting human behaviours
- How do we represent time in connectionist models?
  - One approach is to represent time implicitly, by its effect on processing, rather than explicitly (as in a spatial representation, where time is an additional dimension of the input)
  - The current approach is to use recurrent links in order to provide networks with a dynamic memory
  - In this approach, hidden unit patterns are fed back to themselves; the internal representations which develop thus reflect task demands in the context of prior internal states.
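
The study links mention that the paper treats XOR as time-series data. A sketch of one way such a stream can be built, assuming a setup in which random bit pairs are each followed by their XOR and the task is to predict the next bit; the function names, the seed, and the stream length are hypothetical choices for this example.

```python
import random


def temporal_xor_stream(n_pairs, seed=0):
    """Concatenate triples (a, b, a XOR b) into one flat bit sequence."""
    rng = random.Random(seed)
    bits = []
    for _ in range(n_pairs):
        a, b = rng.randint(0, 1), rng.randint(0, 1)
        bits.extend([a, b, a ^ b])
    return bits


def next_bit_pairs(bits):
    """(input, target) pairs for next-bit prediction."""
    return list(zip(bits[:-1], bits[1:]))


stream = temporal_xor_stream(4)
print(stream)
print(next_bit_pairs(stream)[:3])
```

In such a stream only every third bit can be predicted, and only from the two bits before it, so a model needs some memory of prior inputs to do better than chance.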