# 1. NLP Introduction

Introduction to Natural Language Processing (NLP).

## Why do we want to process natural language?

- analyzing large pools of data that contain information about human behavior and opinions
- a lot of information is never structured, but kept as plain text in natural language instead:
  - medical records
  - financial statements
  - research logs
- building interfaces for machines to communicate with humans more naturally
  - chatbots
  - voice assistants

## What tasks are in the domain of NLP?

### - text classification
![text classification](https://cdn-images-1.medium.com/max/1600/1*ljCBykAJUnvaZcuPYwm4_A.png)

### - sentiment analysis
![sentiment analysis tree](https://blog.paralleldots.com/wp-content/uploads/2017/09/Sentiment-Analysis-Models-Tree-LSTM.png)

### - language modelling
![bidirectional language model](https://cdn-images-1.medium.com/max/2100/1*HNF-Klzkex58xRkxWvI0dw.png)

### - question answering
![question answering pipeline](https://ai2-s2-public.s3.amazonaws.com/figures/2017-08-08/9d4fba9bfd45c4b89795870bfb16daa83ab87208/2-Figure1-1.png)

### - intent detection
![intent extraction pipeline](https://www.lexalytics.com/images/diagrams/intentions.png)

### - named entity recognition
![NER output visualization](https://meenavyas.files.wordpress.com/2018/06/namedentityextraction.png)

### - machine translation (depending on the approach)
![Google Translate](https://cdn.dobreprogramy.pro/wp-content/cache/download-manager/T%C5%82umacz-Google-Translate-e9bc49e1340a5835b92d14a2dd50f34f-625x0.jpg)

## What is NOT NLP?

Handwriting recognition can be aided by natural language processing, but it is a completely separate process
(and falls into the domain of computer vision).
![HandwritingRecognition](https://i.gifer.com/8GQS.gif)

Similarly, voice assistants use NLP, but only after the speech has been converted to text. That conversion step is outside the scope of NLP.
![TextToSpeech](https://d2w9rnfcy7mm78.cloudfront.net/1471510/original_12186e607ed4af7b50747deb5dc8867c.gif)

## Challenges for NLP

- the meaning of the text is not always clear
- context is often necessary to understand the text

How can we solve these problems?

- structuring the data to make its meaning clear
- applying machine learning to approximate solutions
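
As a small illustration of why some structure is needed, the sketch below reduces two sentences to plain word counts. The example sentences and the `bag_of_words` helper are made up for this purpose. Both sentences produce identical counts although they describe opposite events, so word counts alone cannot tell who did what to whom.

```python
from collections import Counter


def bag_of_words(sentence):
    """Count lowercase, whitespace-separated words, ignoring their order."""
    return Counter(sentence.lower().split())


first = bag_of_words("The dog bit the man")
second = bag_of_words("The man bit the dog")

# Same words, same counts: this representation cannot tell the sentences apart.
print(first == second)
```

Recovering the difference requires information about word order or about the relationships between words, which is what the structuring steps aim to provide.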

## How can we structure the data?

Depending on the goal, we can perform different tasks to fit our sentences into a structure that is easier for computers to parse.

For example, we can generate a dependency tree from a sentence, so that relationships between individual words become clear.
This is just one of the things we are going to learn today:

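```python
import spacy
from spacy import displacy

# Load the small English pipeline (it has to be installed separately).
nlp = spacy.load('en_core_web_sm')
doc = nlp("spaCy is the best way to prepare text for deep learning.")
# Draw the dependency tree inline in the notebook.
displacy.render(doc, style='dep', jupyter=True, options={'distance': 100})
```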
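
To make the idea of a dependency tree more concrete, here is a minimal sketch that writes one down as plain Python data, without any library. The sentence, the hand-assigned labels (`det`, `nsubj`, `dobj`, `ROOT`), and the `children_of` helper are illustrative assumptions, not output produced by spaCy.

```python
# Each token is (text, index of its head, dependency label).
# Hand-annotated example; the root token points to itself.
tokens = [
    ("The", 1, "det"),
    ("cat", 2, "nsubj"),
    ("chased", 2, "ROOT"),
    ("a", 4, "det"),
    ("mouse", 2, "dobj"),
]


def children_of(head_index):
    """Return the words directly attached to the token at head_index."""
    return [
        text
        for i, (text, head, _) in enumerate(tokens)
        if head == head_index and i != head_index
    ]


# Print every word together with the word it depends on.
for text, head, label in tokens:
    print(f"{text:<7} --{label}--> {tokens[head][0]}")
```

With this structure, a question such as "who chased whom?" turns into a lookup: `children_of(2)` returns the words attached to "chased", and their labels tell the subject apart from the object.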