# **Dreamweaver AI**
The goal of this project is to use an LLM for coherent story generation.

The plan was to fine-tune a base model for this use case. However, we did not have enough time or resources for training, so instead we ran a simple analysis of our rough, untuned approach.

## Setup

Nothing fancy here, just the typical imports and prerequisites.

In [None]:
!pip install openai tiktoken
!python -m spacy download en_core_web_md

Collecting en-core-web-md==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_md-3.7.1/en_core_web_md-3.7.1-py3-none-any.whl (42.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.8/42.8 MB 33.4 MB/s eta 0:00:00
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_md')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.


In [None]:
import os
import logging
from typing import List, Dict, Optional #Conradium
import tiktoken
from openai import OpenAI
import numpy as np
import spacy
import nltk
nltk.download('punkt_tab')
from nltk.translate.bleu_score import sentence_bleu
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd
from google.colab import userdata


[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!


In [None]:
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

## Classes
A collection of classes for story generation and evaluation.

### Story Generator
Here I use the OpenAI Python client, pointed at OpenRouter's OpenAI-compatible endpoint, to call the Mistral 7B Instruct model.
Instead of hosting a model from Hugging Face directly in Colab, I used a readily available hosted model.

The `:free` model variant used here is free to use.

In [None]:
class StoryGenerator:
    def __init__(
        self,
        model: str = "mistralai/mistral-7b-instruct:free",
        max_tokens: int = 4000,
        iterations: int = 2,
        temperature: float = 0.7,
        top_p: float = 0.9
    ):

        # API
        self.client = OpenAI(
            base_url='https://openrouter.ai/api/v1',
            api_key=userdata.get('openSecret')
        )

        self.model = model
        self.max_tokens = max_tokens
        self.iterations = iterations
        self.temperature = temperature
        self.top_p = top_p
        self.tokenizer = tiktoken.get_encoding("cl100k_base") #Conradium
        self.story_memory: List[Dict] = []

    def _generate_completion(
        self,
        prompt: str,
        system_message: Optional[str] = None
    ) -> str:
        """Generate text completion with error handling"""
        try:
            messages = []

            if system_message:
                messages.append({
                    "role": "system",
                    "content": system_message
                })

            messages.append({
                "role": "user",
                "content": prompt
            })

            completion = self.client.chat.completions.create(
                model=self.model,
                messages=messages,
                max_tokens=self.max_tokens,
                temperature=self.temperature,
                top_p=self.top_p
            )

            generated_text = completion.choices[0].message.content

            # Update story memory
            self.story_memory.append({
                "prompt": prompt,
                "response": generated_text
            })

            return generated_text

        except Exception as e:
            logger.error(f"Error generating completion: {e}")
            return ""

    def generate_story(
        self,
        initial_prompt: str,
        system_prompt: Optional[str] = None
    ) -> str:
        """Generate an opening passage, then extend it over `iterations - 1` continuation passes."""
        if not system_prompt:
            system_prompt = (
                "You are a consistent storyteller. "
                "Maintain narrative flow, character development, "
                "and thematic coherence throughout the story."
            )

        full_text = self._generate_completion(
            initial_prompt,
            system_message=system_prompt
        )

        # Iterative story expansion
        for _ in range(self.iterations - 1):
            # context management: Use last 2000 tokens
            tokens = self.tokenizer.encode(full_text)
            context_tokens = tokens[-2000:]
            context = self.tokenizer.decode(context_tokens)

            continuation_prompt = (
                f"Continue the narrative, maintaining tone and style. "
                f"Previous Context:\n{context}\n\n"
                "Seamlessly extend the story, ensuring narrative coherence."
            )

            continuation = self._generate_completion(
                continuation_prompt,
                system_message=system_prompt
            )

            full_text += "\n\n" + continuation

        return full_text

    def analyze_story_structure(self, story: str) -> Dict:
        """Ask the model for a free-text analysis of the story's narrative structure."""
        analysis_prompt = (
            f"Analyze the narrative structure of the following story:\n\n{story}\n\n"
            "Provide insights on: plot progression, character arcs, "
            "themes, and potential narrative turning points."
        )

        return {
            "raw_analysis": self._generate_completion(analysis_prompt)
        }


#Conradium



I also included a function that this notebook does not call, `analyze_story_structure`, which asks the model to analyze a story's structure and progression. It is meant to probe how deeply the model understands the story.

Sometimes the analysis does not match the generated text: it often shows greater depth than the story itself.

One possible reason is a lack of explicit reasoning during generation.
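The continuation loop in `generate_story` keeps only the last 2000 tokens of the story as context for the next pass. The idea can be sketched in isolation as follows. The helper name `tail_context` is hypothetical, and the default whitespace tokenizer is only a stand-in so the example runs without downloads; it is not what the notebook uses.

```python
from typing import Callable


def tail_context(
    text: str,
    max_tokens: int,
    encode: Callable[[str], list] = str.split,   # stand-in tokenizer (assumption)
    decode: Callable[[list], str] = " ".join,    # inverse of the stand-in tokenizer
) -> str:
    """Rebuild the text from only its last `max_tokens` tokens."""
    if max_tokens <= 0:
        # Guard: tokens[-0:] would return the whole list, not an empty one.
        return ""
    tokens = encode(text)
    return decode(tokens[-max_tokens:])


story_so_far = "Aria unfolded the map and watched the five clocks drift apart"
context = tail_context(story_so_far, max_tokens=4)
print(context)  # five clocks drift apart
```

In the notebook, the same function could be called with `self.tokenizer.encode` and `self.tokenizer.decode`. Note that `cl100k_base` is an OpenAI encoding rather than Mistral's own tokenizer, so the 2000-token window is best read as an approximation of what the model actually sees.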

### Story Coherence Evaluator

Honestly, about half of my understanding of these concepts comes from the internet.
The evaluator combines the Natural Language Toolkit (NLTK), BLEU, and spaCy.
Here is what each one does:

NLTK:
1. Tokenization: Break text down into words or sentences
2. Stemming and lemmatization: Reduce words to their base or root form
3. Text classification

BLEU (Bilingual Evaluation Understudy):
1. N-gram matching: Counts matching n-grams of different lengths (1-grams, 2-grams, 3-grams, and so on) to measure how similar two texts are.

spaCy:
1. Tokenization
2. Part-of-speech tagging: Identifying the grammatical category of each word
3. Named entity recognition: Identifying and classifying named entities in the text, such as organizations and people
4. Dependency parsing: Analyzing the grammatical structure of a sentence.
5. Word embeddings: Representing words as vectors in a high-dimensional space, which helps in capturing semantic meaning.
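To make the n-gram matching behind BLEU more concrete, here is a minimal pure-Python sketch of clipped n-gram precision between two sentences. It is not NLTK's implementation: the helper names `ngrams` and `ngram_precision` and the two example sentences are made up for illustration. Full BLEU combines these precisions with a geometric mean and a brevity penalty, which this sketch leaves out.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count every contiguous run of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(reference: List[str], candidate: List[str], n: int) -> float:
    """Share of candidate n-grams that also occur in the reference (counts clipped)."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Clip each candidate n-gram count by how often it appears in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


reference = "the clocks ran at different speeds".split()
candidate = "the clocks ran slowly in the city".split()

for n in range(1, 5):
    print(n, round(ngram_precision(reference, candidate, n), 3))
# 1 0.429
# 2 0.333
# 3 0.2
# 4 0.0
```

Because BLEU multiplies the precisions of all orders together, a single order with zero matches (the 4-grams here) is enough to pull the unsmoothed score down to essentially zero, even when shorter n-grams overlap.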

In [None]:
class StoryCoherenceEvaluator:
    def __init__(self):
        try:
            self.nlp = spacy.load('en_core_web_md')
            nltk.download('punkt')
        except Exception as e:
            print(f"Error loading NLP models: {e}")
            raise

    def semantic_coherence(self, text):
        doc = self.nlp(text)
        # Each sentence is compared with the one before it, using spaCy's
        # per-sentence vector (the average of its token vectors).

        sentences = [sent.text for sent in doc.sents]
        sentence_vectors = [self.nlp(sent).vector for sent in sentences]

        similarities = []
        for i in range(1, len(sentence_vectors)):
            similarity = cosine_similarity(
                sentence_vectors[i-1].reshape(1, -1),
                sentence_vectors[i].reshape(1, -1)
            )[0][0]
            similarities.append(similarity)

        return {
            'avg_sentence_similarity': np.mean(similarities) if similarities else 0,
            'semantic_coherence_score': np.mean(similarities) if similarities else 0
        }

    def lexical_coherence(self, text):
        sentences = nltk.sent_tokenize(text)

        # TF-IDF vectorization
        vectorizer = TfidfVectorizer()
        tfidf_matrix = vectorizer.fit_transform(sentences)

        # Compute cosine similarities between consecutive sentences #THANK YOU CLAUDE SONNET
        similarities = []
        for i in range(1, len(sentences)):
            similarity = cosine_similarity(
                tfidf_matrix[i-1],
                tfidf_matrix[i]
            )[0][0]
            similarities.append(similarity)

        return {
            'lexical_similarity': np.mean(similarities) if similarities else 0,
            'lexical_coherence_score': np.mean(similarities) if similarities else 0
        }

    def narrative_flow_evaluation(self, text):
        sentences = nltk.sent_tokenize(text)

        # BLEU of each sentence against the previous one, as a rough measure of repeated wording
        bleu_scores = []
        for i in range(1, len(sentences)):
            reference = [sentences[i-1].split()]
            candidate = sentences[i].split()
            bleu_score = sentence_bleu(reference, candidate)
            bleu_scores.append(bleu_score)

        return {
            'narrative_flow_score': np.mean(bleu_scores) if bleu_scores else 0,
            'narrative_progression_variance': np.std(bleu_scores) if bleu_scores else 0
        }

    def comprehensive_coherence_analysis(self, text):
        semantic_analysis = self.semantic_coherence(text)
        lexical_analysis = self.lexical_coherence(text)
        narrative_analysis = self.narrative_flow_evaluation(text)

        # scores combined with weight (reason: adjustable preference)
        coherence_score = (
            0.4 * semantic_analysis['semantic_coherence_score'] +
            0.3 * lexical_analysis['lexical_coherence_score'] +
            0.3 * narrative_analysis['narrative_flow_score']
        )

        return {
            'overall_coherence_score': coherence_score,
            'semantic_coherence': semantic_analysis,
            'lexical_coherence': lexical_analysis,
            'narrative_flow': narrative_analysis
        }

The overall coherence score is a weighted sum of the three component scores, with the weights

semantic : lexical : narrative = 4 : 3 : 3

that is, 0.4, 0.3 and 0.3.
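As a sanity check on the weighting, the sketch below recomputes the overall score from the three component scores reported in the Coherence Evaluation output later in this notebook. The function name `overall_coherence` and the dictionary keys are hypothetical; the weights and the scores are the ones from this notebook, with the scores rounded as displayed.

```python
from typing import Dict

# Weights from comprehensive_coherence_analysis (4:3:3).
WEIGHTS: Dict[str, float] = {"semantic": 0.4, "lexical": 0.3, "narrative": 0.3}


def overall_coherence(scores: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the component coherence scores."""
    return sum(weights[name] * scores[name] for name in weights)


# Component scores as printed in the evaluation output of this notebook.
reported = {"semantic": 0.83585, "lexical": 0.0997712, "narrative": 0.00213948}

overall = overall_coherence(reported)
print(round(overall, 6))  # 0.364913
```

The semantic term alone contributes about 0.334 of the 0.365 total, so with these weights the overall score is driven almost entirely by the spaCy sentence-vector similarity.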

## Usage

### Story

In [None]:
story_generator = StoryGenerator()
initial_prompt = (
    "In a world where time flows differently in certain geographical regions, "
    "tell the story of a young cartographer who inherits a device capable of "
    "synchronizing time across these disparate zones."
)

story = story_generator.generate_story(initial_prompt)
print("Generated Story:\n" + "\n".join(line.strip() for line in story.split("\n") if line.strip()))

Generated Story:
Title: The Chronomap: A Tale of Time and Tides
In the quaint, cobblestone streets of the ancient city of Chronopolis, a young cartographer named Aria lived. Born to a family of scholars and navigators, Aria had spent her life mapping the intricate labyrinth of streets, studying the arcane tomes of her ancestors, and learning the secrets of the city's unique chronodynamic properties.
Chronopolis was a city unlike any other. Its location lay at the intersection of five time zones, each with its own rhythm and flow. The city's clocks ran at different speeds, and the streets were a symphony of ticking and tocking, the sounds harmonizing in a cacophony that was both enchanting and disconcerting.
Aria's father, a renowned scholar and inventor, had spent his life studying these time zones and their impact on the city's inhabitants. He had devoted his final years to the creation of a device that could synchronize the city's clocks, bringing harmony to the chaos and ensuring th

### Coherence Evaluation

In [None]:
coherence_evaluator = StoryCoherenceEvaluator()
coherence_analysis = coherence_evaluator.comprehensive_coherence_analysis(story)

print("Coherence Analysis:")

overall_coherence_df = pd.DataFrame({
    'overall_coherence_score': [coherence_analysis['overall_coherence_score']]
})


semantic_coherence_df = pd.DataFrame.from_dict(coherence_analysis['semantic_coherence'], orient='index').T
lexical_coherence_df = pd.DataFrame.from_dict(coherence_analysis['lexical_coherence'], orient='index').T
narrative_flow_df = pd.DataFrame.from_dict(coherence_analysis['narrative_flow'], orient='index').T

print("Overall Coherence Score:\n" + overall_coherence_df.to_markdown(index=False))
print("\nSemantic Coherence:\n" + semantic_coherence_df.to_markdown(index=False))
print("\nLexical Coherence:\n" + lexical_coherence_df.to_markdown(index=False))
print("\nNarrative Flow:\n" + narrative_flow_df.to_markdown(index=False))

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


Coherence Analysis:
Overall Coherence Score:
|   overall_coherence_score |
|--------------------------:|
|                  0.364913 |

Semantic Coherence:
|   avg_sentence_similarity |   semantic_coherence_score |
|--------------------------:|---------------------------:|
|                   0.83585 |                    0.83585 |

Lexical Coherence:
|   lexical_similarity |   lexical_coherence_score |
|---------------------:|--------------------------:|
|            0.0997712 |                 0.0997712 |

Narrative Flow:
|   narrative_flow_score |   narrative_progression_variance |
|-----------------------:|---------------------------------:|
|             0.00213948 |                        0.0167099 |


The hypothesis contains 0 counts of 3-gram overlaps.
Therefore the BLEU score evaluates to 0, independently of
how many N-gram overlaps of lower order it contains.
Consider using lower n-gram order or use SmoothingFunction()
The hypothesis contains 0 counts of 4-gram overlaps.
Therefore the BLEU score evaluates to 0, independently of
how many N-gram overlaps of lower order it contains.
Consider using lower n-gram order or use SmoothingFunction()
The hypothesis contains 0 counts of 2-gram overlaps.
Therefore the BLEU score evaluates to 0, independently of
how many N-gram overlaps of lower order it contains.
Consider using lower n-gram order or use SmoothingFunction()
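
The warnings above come from NLTK and point to two possible remedies: a smoothing function, or a lower n-gram order. The sketch below shows both on a made-up sentence pair. It is not a change applied to the run above, and the choice of `method1` and of bigram weights is an assumption rather than a tuned setting. Either change would alter `narrative_flow_score`, so the numbers would no longer be comparable with the table above.

```python
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

# Hypothetical adjacent sentences, tokenized by whitespace as in the evaluator.
reference = ["the clocks ran at different speeds".split()]
candidate = "the clocks ran slowly in the city".split()

# Option 1: keep the default 4-gram BLEU but smooth zero n-gram counts.
smoother = SmoothingFunction().method1
smoothed = sentence_bleu(reference, candidate, smoothing_function=smoother)

# Option 2: restrict BLEU to unigrams and bigrams with equal weights.
bigram_only = sentence_bleu(reference, candidate, weights=(0.5, 0.5))

print(f"smoothed 4-gram BLEU: {smoothed:.4f}")
print(f"unigram+bigram BLEU:  {bigram_only:.4f}")
```

Short story sentences rarely share a 4-gram with their predecessor, so a lower n-gram order may be the more meaningful choice for this sentence-to-sentence comparison.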
