# Essay Writing using AI

*   Part 0: Setting up the workspace
*   Part 1: Loading a machine learning model for any text dataset, for free, on a GPU using Colaboratory
*   Part 2: Training the machine learning model on specific topics
*   Part 3: Writing the essay

In [9]:
from IPython.display import HTML, display

def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))
get_ipython().events.register('pre_run_cell', set_css)

In [10]:
%matplotlib inline

import os, sys 
import logging, io, json, warnings
logger = logging.getLogger()
logger.setLevel(logging.CRITICAL)
warnings.filterwarnings('ignore')

In [3]:
%load_ext autoreload
%autoreload 2

## Set up workspace (Mounting Google Drive)

1. Mount your Google Drive
2. Add the code folder to the system path

In [8]:
# Mount Google Drive inside the Colab runtime
from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


In [5]:
mkdir gdrive/'My Drive'/dsscamp

mkdir: cannot create directory ‘gdrive/My Drive/dsscamp’: File exists


In [12]:
cd /content/gdrive/'My Drive'/dscamp2

/content/gdrive/My Drive/dscamp2


In [7]:
ls dscamp_public/'NLP 2'/Essay_Writing/

AI_EssayWriting.ipynb  codes/  datasets/  readme.md


In [13]:
#codepath = os.path.join(nb_path, 'codes')
codepath = os.path.join(os.getcwd(), 'dscamp_public/NLP 2/Essay_Writing/codes')
sys.path.append(codepath)
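
`sys.path.append` accepts a path that does not exist without complaint, so a mistyped Drive folder only surfaces later as an import error. A minimal sketch of a guard, assuming a hypothetical helper name `add_code_path` that is not part of the notebook's own code:

```python
import os
import sys


def add_code_path(path):
    """Append `path` to sys.path once, failing early if the folder is missing.

    Hypothetical helper for illustration; the notebook itself calls
    sys.path.append directly.
    """
    path = os.path.abspath(path)
    if not os.path.isdir(path):
        raise FileNotFoundError(f"Code folder not found: {path}")
    # Avoid duplicate entries when the cell is re-run.
    if path not in sys.path:
        sys.path.append(path)
    return path
```

In the cell above it could be called as `add_code_path(codepath)` in place of `sys.path.append(codepath)`.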

### Install libraries

In [29]:
!pip install transformers==2.10

Collecting transformers==2.10
  Downloading https://files.pythonhosted.org/packages/12/b5/ac41e3e95205ebf53439e4dd087c58e9fd371fd8e3724f2b9b4cdb8282e5/transformers-2.10.0-py3-none-any.whl (660kB)

### GPU
Colaboratory assigns either an NVIDIA T4 GPU or an NVIDIA K80 GPU. The T4 is slightly faster than the older K80 for training GPT-2 and has more memory, which allows you to train the larger GPT-2 models and generate more text.

You can verify which GPU is active by running the cell below.
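
One way to run that check is to query `nvidia-smi`, which is available on Colab GPU runtimes. The sketch below is an illustration rather than the notebook's original cell; the function name `describe_gpu` is an assumption.

```python
import shutil
import subprocess


def describe_gpu():
    """Return the GPU name and total memory, or a note that no GPU was found."""
    # nvidia-smi is only on the PATH when an NVIDIA driver is installed.
    if shutil.which("nvidia-smi") is None:
        return "No NVIDIA GPU detected (nvidia-smi not found)"
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() or "nvidia-smi ran but reported no GPU"


print(describe_gpu())
```

If no GPU is reported, switch the runtime type to GPU in the Colab runtime settings and rerun the cell.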

In [1]:
import numpy as np
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

In [3]:
MAX_LENGTH = int(10000)

In [4]:
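def set_seed(seed):
    """Seed NumPy and PyTorch (CPU and all GPUs) for reproducible sampling."""
    np.random.seed(seed)
    torch.manual_seed(seed)
    # `args` is never defined in this notebook, so check for CUDA directly.
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)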

In [14]:
from main import GenerateSentence 

## Loading machine learning model
The model used here is GPT-2, which was trained on a corpus of roughly 40 GB of text drawn from **8 million** web pages. Its largest variant has 1.5 billion parameters.
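
The `GenerateSentence` class lives in the `codes` folder and is not shown here, so its internals are unknown. As a rough sketch of what loading GPT-2 and sampling a few candidate continuations can look like with the `transformers` classes imported earlier, assuming the smallest public `gpt2` checkpoint and illustrative sampling settings (`top_k=50`, `top_p=0.95`) that do not come from the notebook:

```python
def load_gpt2(model_name="gpt2"):
    """Load a GPT-2 tokenizer and model by checkpoint name (downloads on first use)."""
    # Imported lazily so the sampling helper below can be read without the library.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    model = GPT2LMHeadModel.from_pretrained(model_name)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def sample_continuations(tokenizer, model, prompt, n=3, max_new_tokens=40):
    """Sample `n` candidate continuations of `prompt` and return them as strings."""
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    prompt_len = len(input_ids[0])
    outputs = model.generate(
        input_ids,
        max_length=prompt_len + max_new_tokens,
        do_sample=True,  # sample instead of greedy decoding
        top_k=50,        # assumed value, for illustration only
        top_p=0.95,      # assumed value, for illustration only
        num_return_sequences=n,
        pad_token_id=tokenizer.eos_token_id,
    )
    texts = []
    for output in outputs:
        # Drop the prompt tokens so only the newly generated text remains.
        new_ids = [int(i) for i in output[prompt_len:]]
        texts.append(tokenizer.decode(new_ids, skip_special_tokens=True).strip())
    return texts
```

A typical call would be `tokenizer, model = load_gpt2()` followed by `sample_continuations(tokenizer, model, "AI is an important aspect of modern life.")`. As written, the sketch runs on the CPU; moving the model and the input tensor to the GPU is left out for brevity.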

In [15]:
generate_sentence = GenerateSentence(dataset_path='dscamp_public/NLP 2/Essay_Writing/datasets')

'Language Generator loaded successfully....'


In [16]:
import nltk
nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

## Train model on specific topics for free on Google Colab GPUs
You can also provide your own datasets.
The default topics to choose from are:

1. Artificial intelligence
2. Machine learning
3. History of the United States

More topics will be added.

In [17]:
generate_sentence.train_on_topics('ai')





## Start Writing Essays

At each step, five options are provided:

* A -> AI option
* B -> AI option
* C -> AI option
* D -> User can choose to add sentences
* E -> STOP the writing process
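
The menu above amounts to a small state update per round: append one sentence to the essay, or stop. A minimal sketch of that logic, using a hypothetical function `apply_choice` that is not taken from the notebook's `GenerateSentence` class:

```python
def apply_choice(essay, candidates, choice, own_sentence=""):
    """Apply one menu choice and return (updated_essay, keep_writing).

    `candidates` maps "A", "B" and "C" to AI-generated sentences.
    Hypothetical helper for illustration only.
    """
    choice = choice.strip().upper()
    if choice == "E":
        # Stop: the essay is returned unchanged.
        return essay, False
    if choice == "D":
        sentence = own_sentence.strip()
    elif choice in candidates:
        sentence = candidates[choice].strip()
    else:
        raise ValueError(f"Unknown option: {choice!r}")
    # Join with a single space and skip empty pieces (an AI option may be blank).
    parts = [part for part in (essay.rstrip(), sentence) if part]
    return " ".join(parts), True
```

Skipping empty pieces matters because, as the transcript below shows, an AI option can come back blank; choosing it then leaves the essay unchanged instead of adding stray whitespace.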

In [19]:
generate_sentence.start_writing()

Write the first sentence >>> AI is an important aspect of modern life.


In [20]:
generate_sentence.generate_sentences()

::::YOURS OPTIONS ARE :::
A. --> 

B. --> 

C. --> 

D. --> Write your own sentences
E. --> STOP ESSAY WRITING
Choose your option >>> D
Insert your own sentence >>> You can see applications of AI everywhere.
 
 
 
 ****** ESSAY TILL THIS POINT *******
AI is an important aspect of modern life. You can see applications of AI everywhere. 



::::YOURS OPTIONS ARE :::
A. --> 
The first time I saw this, it was in the news that a lot of people were using Google's DeepMind to do their own research on how they could create and use artificial intelligence (AIs).

B. -->  It's a very interesting thing to do, and I think it would be good for the future.

C. --> 

D. --> Write your own sentences
E. --> STOP ESSAY WRITING
Choose your option >>> A
 
 
 
 ****** ESSAY TILL THIS POINT *******
AI is an important aspect of modern life. You can see applications of AI everywhere. The first time I saw this, it was in the news that a lot of people were using Google's DeepMind to do their own research on how

In [None]:
generate_sentence = GenerateSentence(dataset_path='dscamp_public/NLP 2/Essay_Writing/datasets')

'Language Generator loaded successfully....'


In [None]:
generate_sentence.train_on_topics('history')





In [None]:
generate_sentence.start_writing()

Write the first sentence >>> George Washington is an important figure in American history.


In [None]:
generate_sentence.generate_sentences()

::::YOURS OPTIONS ARE :::
A. -->  He was a member of theocratic and anti-Catholic Church, he served as president for many times before being elected to Congress but never once ran against it again.

B. --> 
The first president of the United States was born on March 6, 1776 and died at age 85 (18 years old).

C. -->  He was born on April 1, 1876 at the home of his mother and father who died when he had a heart attack during World War II
The first lady's name came from her husband George W.

D. --> Write your own sentences
E. --> STOP ESSAY WRITING
{'A': 'George Washington is an important figure in American history. He was a member of theocratic and anti-Catholic Church, he served as president for many times before being elected to Congress but never once ran against it again.', 'B': 'George Washington is an important figure in American history.\nThe first president of the United States was born on March 6, 1776 and died at age 85 (18 years old).', 'C': "George Washington is an important