# Challenge 1: Natural Language Processing Overview

### This challenge presents an overview of Natural Language Processing (NLP). You will watch two videos to gain a general understanding of the field.

## Objectives

* Understand what NLP is.
* Learn about the general evolution of NLP technology.
* Learn about the common areas of research in NLP.
* Learn about the common applications of NLP.
* Understand the general steps to conduct NLP analysis.
* Install NLTK.

## Install NLTK

### Before we start learning, let's first install [Python NLTK](https://www.nltk.org/) because it takes a while to download the package. Follow the steps below.

#### 1. **Install NLTK:** 

If you have installed `pip` on your computer, simply run `pip install nltk`. If you don't have `pip`, follow the instructions [here](https://www.nltk.org/install.html).
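
To confirm that the `pip` step worked before moving on, a quick check like the following may help. This is a minimal sketch using only the standard library; the helper name `is_installed` is made up for illustration and is not part of NLTK.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found by the import system."""
    # find_spec returns None when no such top-level module is available
    return importlib.util.find_spec(package_name) is not None


# Reports whether NLTK is importable in the current Python environment
print("nltk installed:", is_installed("nltk"))
```

If this prints `False`, the package was probably installed into a different Python environment than the one you are running.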

#### 2. **Install NLTK data:**

Launch a Python shell and run:

```python
import nltk
nltk.download()
```

The data library is over 3 GB and takes a while to download. You can follow the progress in the download manager that NLTK opens automatically. While the data is downloading, proceed to the next section.

For more information about downloading NLTK data, see the [NLTK data documentation](https://www.nltk.org/data.html).
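
If you would rather not wait for the full collection, one option is to fetch individual packages only when they are missing. The sketch below assumes that `nltk.data.find` raises `LookupError` for a resource that is not installed and that `nltk.download` accepts a package identifier and a `quiet` flag; the helper name `ensure_nltk_resource` is hypothetical.

```python
def ensure_nltk_resource(resource_path: str, package_id: str) -> bool:
    """Download an NLTK package only if its resource is not already present.

    resource_path: location inside nltk_data, e.g. "corpora/brown"
    package_id:    identifier passed to the downloader, e.g. "brown"
    """
    # Imported lazily so that defining this helper does not require NLTK
    import nltk

    try:
        nltk.data.find(resource_path)
        return True  # already installed, nothing to download
    except LookupError:
        # Resource is missing: fetch just this one package
        return nltk.download(package_id, quiet=True)
```

For example, `ensure_nltk_resource("corpora/brown", "brown")` would fetch only the Brown corpus used later in this challenge, and only when it is not already on disk.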

## Video 1: [Natural Language Processing Crash Course](https://www.youtube.com/watch?v=fOvTtapxa9c)

Your NLTK data library should now have finished downloading; confirm this in the download manager. Then launch a Python shell and test the installation:

```python
import nltk

# Download the Brown corpus (only needed if it is not already installed)
nltk.download("brown")

# [nltk_data] Downloading package brown to
# [nltk_data]     C:\Users\aquir\AppData\Roaming\nltk_data...
# [nltk_data]   Unzipping corpora\brown.zip.
# True

from nltk.corpus import brown

# First ten words of the Brown corpus
brown.words()[0:10]
# ['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', 'Friday', 'an', 'investigation', 'of']

# First ten words together with their part-of-speech tags
brown.tagged_words()[0:10]
# [('The', 'AT'), ('Fulton', 'NP-TL'), ('County', 'NN-TL'), ('Grand', 'JJ-TL'), ('Jury', 'NN-TL'),
#  ('said', 'VBD'), ('Friday', 'NR'), ('an', 'AT'), ('investigation', 'NN'), ('of', 'IN')]

text = '''The cat (Felis catus) is a small carnivorous mammal. It is the only domesticated species in the
family Felidae and often referred to as the domestic cat to distinguish it from wild members of the family.
The cat is either a house cat, a farm cat or a feral cat; latter ranges freely and avoids human contact. Domestic
cats are valued by humans for companionship and for their ability to hunt rodents. About 60 cat breeds are recognized
by various cat registries.The cat is similar in anatomy to the other felid species, has a strong flexible body, quick
reflexes, sharp teeth and retractable claws adapted to killing small prey. Its night vision and sense of smell are well
developed. Cat communication includes vocalizations like meowing, purring, trilling, hissing, growling and grunting as
well as cat-specific body language. It is a solitary hunter, but a social species. It can hear sounds too faint or too
high in frequency for human ears, such as those made by mice and other small mammals. It is a predator that is most active
at dawn and dusk. It secretes and perceives pheromones. Female domestic cats can have kittens from spring to late autumn,
with litter sizes ranging from two to five kittens.[9] Domestic cats are bred and shown as registered pedigreed cats, a hobby
known as cat fancy. Failure to control breeding of pet cats by spaying and neutering, as well as abandonment of pets, resulted
in large numbers of feral cats worldwide, contributing to the extinction of entire bird species, and evoking population control.
'''

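from nltk import sent_tokenize, word_tokenize

# Download the Punkt sentence tokenizer models (only needed once).
# Note: recent NLTK releases may ask for the 'punkt_tab' resource instead.
nltk.download('punkt')
# [nltk_data] Downloading package punkt to
# [nltk_data]     C:\Users\aquir\AppData\Roaming\nltk_data...
# [nltk_data]   Package punkt is already up-to-date!
# True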

sent_tokenize(text)

['The cat (Felis catus) is a small carnivorous mammal.',
 'It is the only domesticated species in the\nfamily Felidae and often referred to as the domestic cat to distinguish it from wild members of the family.',
 'The cat is either a house cat, a farm cat or a feral cat; latter ranges freely and avoids human contact.',
 'Domestic\ncats are valued by humans for companionship and for their ability to hunt rodents.',
 'About 60 cat breeds are recognized\nby various cat registries.The cat is similar in anatomy to the other felid species, has a strong flexible body, quick\nreflexes, sharp teeth and retractable claws adapted to killing small prey.',
 'Its night vision and sense of smell are well\ndeveloped.',
 'Cat communication includes vocalizations like meowing, purring, trilling, hissing, growling and grunting as\nwell as cat-specific body language.',
 'It is a solitary hunter, but a social species.',
 'It can hear sounds too faint or too\nhigh in frequency for human ears, such as those