* **`App`** - [Ai Comic Generator]()

* **`Full Project`** - [Ai Comic Generator](https://github.com/sanskarGupta551/AI_work/tree/main/Portfolio_project___Ai_Comic_Generator) 

# **1.1 Story Generation**

* After collecting the `User Inputs`, the next step is to write the `Story` for the Comic. 
* In this notebook, we will build the entire `Story Generation` pipeline for our `Comics`, which uses specific `Prompts` to generate high-quality `stories` with rich context.  

'''

### **Generating Stories using specific Prompts**

* Building a model that generates text and stories using `specific prompts` involves a few key steps. 
* These include `understanding` the concept of `prompt engineering`, identifying `the task` and purpose of your prompt, being `specific` with your prompt, and `refining your model` through an iterative approach.

##### **Prompt Engineering**

* `Prompt engineering` is a critical concept when it comes to generating text using AI models. 
* It involves crafting a `framework` OR a `set of instructions` that the AI can use to generate the desired text. 
* The `GPT` model, for example, uses the prompt to `determine` the structure and content of the `text` it needs to generate.

##### **Identify the Task and Purpose of Your Prompt**

* Before you start creating your prompt, it is vital to know what task you want the model to accomplish and what the `desired result` should be. 
* This could be `generating a narrative`, acquiring knowledge, or another type of artistic expression. 
* Having a clear `understanding` of the task makes it easier to compose a `prompt` that is specific and relevant to the topic.

##### **Be Specific In Your Writing**

* The `quality` and relevance of the generated `text` will be greatly influenced by the `specificity` of your `prompt`. 
* For instance, if you're using a story generation model, you could start with an `input` like "Once upon a time in a town called XYZ" instead of a more general input like "Once upon a time".

##### **Iterative Refinement**

* It's important to note that `generating` high-quality text using AI models is an `iterative process`. 
* This means that you may need to `refine` your prompt or adjust your `model parameters` based on the quality of the generated text. 
* This process of refinement and optimization can help you achieve more accurate and natural-sounding results.

'''

### **Pipeline** 

* In outline, our **Story Generation Pipeline** works as follows - 

1. **User** > `User Inputs`  
2. `User Inputs` > **Prompts Generator** > `Prompts` > **Story Generator** > `Story` 
3. `Story` > **Summarizer** > `Story Plot Summary` & `Comic Title` 
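
The three steps above can be read as a chain of functions. The sketch below only illustrates that data flow: the type aliases, `run_story_pipeline`, and the `demo_*` stages are hypothetical names that do not appear in the notebook, and the placeholder stages stand in for the real models.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stage signatures, introduced here for illustration only.
PromptsGenerator = Callable[[Dict[str, str]], str]
StoryGenerator = Callable[[str], str]
Summarizer = Callable[[str], Tuple[str, str]]  # returns (plot summary, comic title)


def run_story_pipeline(
    user_inputs: Dict[str, str],
    make_prompt: PromptsGenerator,
    generate_story: StoryGenerator,
    summarize: Summarizer,
) -> Dict[str, str]:
    """Chain the stages: User Inputs > Prompt > Story > Summary and Title."""
    prompt = make_prompt(user_inputs)
    story = generate_story(prompt)
    summary, title = summarize(story)
    return {"prompt": prompt, "story": story, "summary": summary, "title": title}


# Placeholder stages that only show the data flow; real stages would call models.
def demo_prompt(inputs: Dict[str, str]) -> str:
    return f"Write a story based on - {inputs['comic_idea']}."


def demo_story(prompt: str) -> str:
    return f"[story generated from: {prompt}]"


def demo_summary(story: str) -> Tuple[str, str]:
    return (story[:40], "Untitled Comic")


result = run_story_pipeline(
    {"comic_idea": "A Dog in space"}, demo_prompt, demo_story, demo_summary
)
```

Passing the stages in as arguments keeps each one replaceable, so a trained model can later take the place of a placeholder without changing the pipeline function.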

'''

#### **Process**

* We will solve the task of **Story Generation** through the following process - 
    * [1. User Inputs](#1)
    * [2. Prompts Generator](#2)
    * [3. Story Generator](#3)
    * [4. Summarizer](#4)

***
***

## **1. User Inputs** <a id="1"></a>

* The following `User Inputs` will influence the `Story` elements of the `Comic`.

##### **a. Comic Idea** :- 
* An **Optional** `Text` Input. 
* General Idea of the Comic. 

##### **b. Have Dialogues** :- 
* A `Boolean` Input. 
* Determines whether the Comic will have Dialogues. 

##### **c. Literary Elements** :- 
* A list of `Options` to choose from in `Text` format. 
* Determines the flavour of the Comics. 

Below are the `Literary elements` with the `Options` listed for them.

* `Style` :- Plot First, Full Script, Classic Comic Strip, Western Comic Style, Modern Age Styles, ~Pick Any 
* `Tone` :- Joyful, Serious, Humorous, Sad, Formal, Informal, Optimistic, Pessimistic, Horror, ~Pick Any 
* `Mood` :- Cheerful, Humorous, Idyllic, Madness, Melancholic, Suspenseful, Romantic, Horrifying, Inspirational, ~Pick Any 
* `Pace` :- Fast-paced, Slow-paced, Variable pacing, Steady pacing, ~Pick Any 
* `Theme` :- Fantasy, Reality, Autobiography, Humor, Politics, Love and Friendship, Courage and Heroism, Good vs Evil, ~Pick Any 

##### **d. Conflict** :- 
* An **Optional** `Text` input. 
* Plot Conflict around which the Story revolves. 

##### **e. Villain** :- 
* An **Optional** `Text` input. 
* Describes the main Antagonist in the Comic if any. 

##### **f. Ending** :- 
* An **Optional** `Text` input. 
* Describes an Ending the User prefers to see. 

##### **g. Keywords** :- 
* An **Optional** `Text` input.
* Additional keywords that determine the features of the story-telling.
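
One possible way to hold these inputs in code is a small dataclass that checks the `Literary Elements` against the option lists above. This is a sketch under assumptions: the names `UserInputs`, `LITERARY_OPTIONS`, and `PICK_ANY` are hypothetical, and the defaults (dialogues enabled, `~Pick Any` for every element, empty strings for optional text) are illustrative choices, not something the notebook specifies.

```python
from dataclasses import dataclass

PICK_ANY = "~Pick Any"

# Option lists copied from the Literary Elements above.
LITERARY_OPTIONS = {
    "style": ["Plot First", "Full Script", "Classic Comic Strip",
              "Western Comic Style", "Modern Age Styles"],
    "tone": ["Joyful", "Serious", "Humorous", "Sad", "Formal", "Informal",
             "Optimistic", "Pessimistic", "Horror"],
    "mood": ["Cheerful", "Humorous", "Idyllic", "Madness", "Melancholic",
             "Suspenseful", "Romantic", "Horrifying", "Inspirational"],
    "pace": ["Fast-paced", "Slow-paced", "Variable pacing", "Steady pacing"],
    "theme": ["Fantasy", "Reality", "Autobiography", "Humor", "Politics",
              "Love and Friendship", "Courage and Heroism", "Good vs Evil"],
}


@dataclass
class UserInputs:
    comic_idea: str = ""          # optional text
    have_dialogues: bool = True   # Boolean input (default is an assumption)
    style: str = PICK_ANY
    tone: str = PICK_ANY
    mood: str = PICK_ANY
    pace: str = PICK_ANY
    theme: str = PICK_ANY
    conflict: str = ""            # optional text
    villain: str = ""             # optional text
    ending: str = ""              # optional text
    keywords: str = ""            # optional text

    def __post_init__(self):
        # Reject any literary element that is neither listed nor "~Pick Any".
        for name, options in LITERARY_OPTIONS.items():
            value = getattr(self, name)
            if value != PICK_ANY and value not in options:
                raise ValueError(f"{name}={value!r} is not one of {options}")
```

For example, `UserInputs(comic_idea="A Dog in space", style="Plot First")` is accepted, while `UserInputs(mood="Sleepy")` raises a `ValueError`.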

***
***

## **2. Prompts Generator** <a id="2"></a> 

* There are some things to mention here - 
    * First, we only need `prompts` for `one` purpose - **Story Generation**. 
    * Secondly, our `User Inputs` are `fixed` and `predictable`. 
* This means that we can use a `simple function` that creates a `string` with all the `User Inputs` filled in at the right places, and then use these prompts both to `Generate` Stories and to `Train` the Model. 

'''

* Below is the list of `User Inputs` that we will use in our `Prompts` - 
    * Comic Idea 
    * Have Dialogues 
    * Literary Elements `(Style, Tone, Mood, Pace, Theme)` 
    * Conflict 
    * Villain 
    * Ending 
    * Keywords 
* Let's create a `Prompt Generator` function based on these `Inputs`.

In [13]:
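def Prompt_Generator(comic_idea, have_dialogues, style, tone, mood, pace, theme, 
                     conflict, villain, ending, keywords): 
    """Fill the User Inputs into a fixed prompt template and return it as a string."""
    # `have_dialogues` is inserted as text right before "have dialogues":
    # pass "" for a story with dialogues, or "not " for a story without them.
    prompt = f"""Write a {style}, {tone}, {mood}, {pace}, {theme} story. The story is based on - {comic_idea}. The story will {have_dialogues}have dialogues. The conflict is {conflict}. The villain is {villain}. The ending is {ending}. More keywords are {keywords}.""" 
    return prompt 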


* Now let's assign sample values to these `User Inputs` so that we can test the `Prompts`. 

In [14]:
comic_idea = "A Dog in space" 
have_dialogues = "" 
style = "Plot First" 
tone = "Joyful" 
mood = "Cheerful" 
pace = "Fast-paced" 
theme = "Fantasy" 
conflict = "A sandwich" 
villain = "A cute teddy bear" 
ending = "A good ending" 
keywords = "suspense, thriller, action, travelling" 

In [15]:
Prompt_Generator(comic_idea, have_dialogues, style, tone, mood, pace, theme, 
                     conflict, villain, ending, keywords)

'Write a Plot First, Joyful, Cheerful, Fast-paced, Fantasy story. The story is based on - A Dog in space. The story will have dialogues. The conflict is A sandwich. The villain is A cute teddy bear. The ending is A good ending. More keywords are suspense, thriller, action, travelling.'
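
Since `Have Dialogues` is described as a `Boolean` input and several inputs are optional, a variant of the function could convert the flag itself and skip sentences whose input is empty. The following is a minimal sketch of that idea; `build_prompt` is a hypothetical name and not part of the notebook, and dropping empty sentences is an assumption about the desired behaviour.

```python
def build_prompt(comic_idea, have_dialogues, style, tone, mood, pace, theme,
                 conflict="", villain="", ending="", keywords=""):
    """Hypothetical variant: Boolean dialogue flag, empty optional inputs skipped."""
    sentences = [f"Write a {style}, {tone}, {mood}, {pace}, {theme} story."]

    if comic_idea.strip():
        sentences.append(f"The story is based on - {comic_idea.strip()}.")

    # Map the Boolean flag to the text fragment used in the prompt.
    negation = "" if have_dialogues else "not "
    sentences.append(f"The story will {negation}have dialogues.")

    optional = [
        ("The conflict is", conflict),
        ("The villain is", villain),
        ("The ending is", ending),
        ("More keywords are", keywords),
    ]
    for lead, value in optional:
        if value.strip():  # leave out sentences for inputs the user skipped
            sentences.append(f"{lead} {value.strip()}.")

    return " ".join(sentences)
```

With the sample values above and `have_dialogues=True`, this produces the same string as the output shown; with every optional input left empty, the prompt shrinks to the opening sentence plus the dialogue sentence.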

***
***

## **3. Story Generator** <a id="3"></a> 

* Here, we will build and `train` a `Deep Learning Model` that takes instructions from `prompts` and generates `stories`. 
* Although `Prompt Generation` comes earlier in the pipeline, `Story Generation` is the `focal` point here, so we will begin with it. 

#### **Story Generation Method - `Neural Network-Based Story Generation`**

* This method involves `training neural networks` on a corpus of text, enabling them to generate a wide range of stories on various topics. 

* Some approaches use a `language model` as a resource to answer questions about a story world, with the answers becoming the content of the story itself.  

* Other methods, like the `C2PO` system, use neural networks like `COMET` to infer what might come next in a story, creating a directed acyclic graph of plausible successors and predecessors until a complete path from start to end is found. 

* Another approach in neural story generation is `controllable story generation`. In this method, `key points` in the story are provided, and the model is tasked with `in-filling` the narrative. For instance, given `a beginning and an ending`, the model attempts to generate the `middle part` of the story while keeping the whole narrative `coherent`.

* Neural network-based systems can `generate` a wider range of stories but may `struggle` with maintaining coherence over `longer` narratives. 

'''

#### **Process** 

* [a. Building a Custom Dataset](#11)
* [b. Building a GPT Model](#12)
* [c. Training the GPT Model](#13)
* [d. Evaluating the Model](#14)

***

### **a. Building a Custom Dataset** <a id="11"></a>

> **`Your Model is as good as your Data`**. 

* Our dataset for training our `Story Generation` model will consist of two things - 
    1. **Prompts** with context as `Labels`. 
    2. **Stories** as `Training Data`. 
* Through this data, our model will learn the `relation` between `Prompts` and `Generated Stories`. 
* As a result, our model will be able to `generate` better `stories` based on `User Inputs`. 

'''

#### **Process** 

1. **Collecting Stories** 
2. **Creating Prompts** 
3. **Data Preprocessing** 

'''

#### **1. Collecting Stories** 

* `Books Dataset` - [books3](https://github.com/soskek/bookcorpus/issues/27#issuecomment-716104208) (37 GB)

'''

####  **`Wattpad Dataset`** 

* The `Wattpad Dataset` is a collection of over `600,000 stories` from the popular online platform `Wattpad`. 

> Wattpad is a website and app where users can create and share their own stories in various `categories`, such as fanfiction, horror, humor, and adventure. Wattpad has over `90 million monthly users` and hosts stories in over `50 languages`.

* The `Wattpad Dataset` was created by researchers from the University of Padova, Italy, who scraped the `metadata and text of the stories` from the Wattpad website in 2018. The dataset contains information such as the `title, author, genre, tags, description, votes, comments, reads, and parts of each story`. The dataset also includes the network structure of the authors and readers, such as who follows whom and who votes for which stories.

* The `Wattpad Dataset` can be used for various `purposes`, such as analyzing the characteristics and trends of online storytelling, exploring the social dynamics and interactions of the Wattpad community, and developing natural language processing and `text generation models` based on the stories. The dataset is publicly available on GitHub, where you can also find the scripts and tools used to create and process it. 


In [None]:
# Find as many stories as possible under 10000 words. (~100000 stories) 
# Collect them in a .csv file with metadata. 
# Generate other data. (~such as those in prompts) 
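
The first two notes above could be sketched as follows. This is only an illustration under assumptions: stories are taken to be dictionaries with `title`, `genre`, and `text` keys, word counts are approximated by splitting on whitespace, and the function names are hypothetical. The sample data is made up for the demonstration and is not drawn from any dataset.

```python
import csv
import io

MAX_WORDS = 10_000  # word limit from the notes above


def keep_short_stories(stories, max_words=MAX_WORDS):
    """Keep only stories with fewer than `max_words` whitespace-separated words."""
    return [s for s in stories if len(s["text"].split()) < max_words]


def write_stories_csv(stories, handle, fields=("title", "genre", "text")):
    """Write the stories, with their metadata columns, to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=list(fields), extrasaction="ignore")
    writer.writeheader()
    writer.writerows(stories)


# Made-up sample stories, used only to exercise the two functions.
sample = [
    {"title": "Short", "genre": "humor", "text": "word " * 50},
    {"title": "Long", "genre": "horror", "text": "word " * 10_000},
]
kept = keep_short_stories(sample)

buffer = io.StringIO()  # stands in for a real .csv file
write_stories_csv(kept, buffer)
```

To write an actual file, the same function can be called with `open("stories.csv", "w", newline="", encoding="utf-8")` as the handle; the file name here is a placeholder.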

'''

#### **2. Creating Prompts**

* 

'''

**Notes:-** 
* When generating prompts for the dataset, leave some of the inputs as empty strings. This simulates real usage of the app, where a User may skip the optional inputs. 
* Set a limit on the number of characters taken from each input. 
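
Both notes can be sketched in a few lines. The helper below is hypothetical: the name `simulate_user_inputs`, the drop probability of `0.3`, and the limit of `200` characters are illustrative assumptions, since the notebook does not fix any of these values.

```python
import random

MAX_CHARS = 200  # hypothetical per-input character limit
OPTIONAL_FIELDS = ("comic_idea", "conflict", "villain", "ending", "keywords")


def simulate_user_inputs(inputs, drop_prob=0.3, max_chars=MAX_CHARS, rng=None):
    """Blank out some optional inputs at random and truncate the others."""
    rng = rng or random.Random()
    simulated = dict(inputs)  # copy, so the original inputs stay untouched
    for field in OPTIONAL_FIELDS:
        value = simulated.get(field, "")
        if rng.random() < drop_prob:
            simulated[field] = ""                 # user skipped this input
        else:
            simulated[field] = value[:max_chars]  # enforce the character limit
    return simulated
```

Passing a seeded generator, for example `rng=random.Random(0)`, makes the simulated dataset reproducible between runs.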

'''

#### **3. Data Preprocessing** <a id="13"></a> 

*

***