# QuantPy Twitter Bot

This Jupyter notebook guides you through the process of prompt engineering and storing the resulting data in Python dataclasses.

## Prompt Engineering

Langchain prompting template adapted from [https://github.com/gkamradt](https://github.com/gkamradt/langchain-tutorials/blob/36957e9be70c09dcadaefb2caf790111170dd132/bots/Twitter_Reply_Bot/Twitter%20Reply%20Bot%20Notebook.ipynb)

More on prompt engineering and langchain can be found [here](https://python.langchain.com/docs/modules/model_io/prompts/prompt_templates/prompts_pipelining)

<b>Note on "database" structure: </b>
 - Although not perfect, our database for the moment will be regular text files whose fields are separated by '|' (dividers), with each row being a new entry.

In [1]:
import os
import re
from dotenv import load_dotenv

from langchain.chat_models import ChatOpenAI
from langchain.prompts import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    AIMessagePromptTemplate,
    HumanMessagePromptTemplate,
)

## Setting up variables

In [3]:
load_dotenv()

# Secret keys from .env file
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

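# Directory variables
# Notebooks do not define __file__; the quoted string makes CUR_DIR resolve to the current working directory
CUR_DIR = os.path.dirname(os.path.abspath('__file__'))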
APP_DIR = os.path.abspath(os.path.join(CUR_DIR, os.pardir))
LOG_FILE = os.path.join(APP_DIR, "twitter-bot.log")
RAW_TEXT_FILE = os.path.join(APP_DIR, "data/raw/content-ideas.txt")
TEXT_FILE = os.path.join(APP_DIR, "data/processed/quants_tweets.txt")

# print variables
print(f"Application Dir: \n\t{APP_DIR}")
print(f"Raw Tweet File Dir: \n\t{RAW_TEXT_FILE}")
print(f"Processed Tweet File Dir: \n\t{TEXT_FILE}")

Application Dir: 
	/Users/jonathonemerick/Documents/dev/quantpy-twitter-bot
Raw Tweet File Dir: 
	/Users/jonathonemerick/Documents/dev/quantpy-twitter-bot/data/raw/content-ideas-2.txt
Processed Tweet File Dir: 
	/Users/jonathonemerick/Documents/dev/quantpy-twitter-bot/data/processed/quants_tweets_2.txt


## Prompt Engineering with ChatGPT

In [21]:
llm = ChatOpenAI(temperature=0.3,
                 openai_api_key=OPENAI_API_KEY,
                 model_name='gpt-3.5-turbo-0613',
                )

In [22]:
def generate_response(
    llm: ChatOpenAI, quant_topic: str, quant_title: str
) -> tuple[str, str]:
    """Generate AI Twitter Content for QuantPy Twitter Account

    Parameters:
        - llm: pre-trained ChatOpenAI large language model
        - quant_topic: Topic in Quant Finance
        - quant_title: Title of the tweet

    Returns:
        - tuple[long response, short response]: ChatGPT long and short responses
    """
    # System Template for LLM to follow
    system_template = """
        You are an incredibly wise and smart quantitative analyst that lives and breathes the world of quantitative finance.
        Your goal is to write short-form content for twitter given a `topic` in the area of quantitative finance and a `title` from the user.
        
        % RESPONSE TONE:

        - Your response should be given in an active voice and be opinionated
        - Your tone should be serious w/ a hint of wit and sarcasm
        
        % RESPONSE FORMAT:
        
        - Be extremely clear and concise
        - Respond with phrases no longer than two sentences
        - Do not respond with emojis
        
        % RESPONSE CONTENT:

        - Include specific examples of where this is used in the quantitative finance space
        - If you don't have an answer, say, "Sorry, I'll have to ask the Quant Finance Gods!"    

        % RESPONSE TEMPLATE:

        - Here is the response structure: 
            Hook: Captivate with a one-liner.
            Intro: Briefly introduce the topic.
            Explanation: Simplify the core idea.
            Application: Note real-world relevance.
            Closing: Reflective one-liner.
            Action: Short engagement call.
            Engagement: Quick question.
    
    """
    # system prompt template to follow
    system_message_prompt = SystemMessagePromptTemplate.from_template(system_template)

    # human template for input
    human_template = "topic to write about is {topic}, and the title will be {title}. Keep the total response under 200 words total!"
    human_message_prompt = HumanMessagePromptTemplate.from_template(
        human_template, input_variables=["topic", "title"]
    )

    # chat prompt template construction
    chat_prompt = ChatPromptTemplate.from_messages(
        [system_message_prompt, human_message_prompt]
    )

    # get a completed chat using formatted template with topic and title
    final_prompt = chat_prompt.format_prompt(
        topic=quant_topic, title=quant_title
    ).to_messages()

    # pass template through llm and extract content attribute
    first_response = llm(final_prompt).content

    # construct AI template, to pass back OpenAI response
    ai_message_prompt = AIMessagePromptTemplate.from_template(first_response)

    # additional prompt to remind ChatGPT of length requirement
    reminder_template = "This was good, but way too long, please make your response much more concise and much shorter! Make phrases no longer than 15 words in total. Please maintain the existing template."
    reminder_prompt = HumanMessagePromptTemplate.from_template(reminder_template)

    # chat prompt template construction with additional AI response and length reminder
    chat_prompt2 = ChatPromptTemplate.from_messages(
        [system_message_prompt, human_message_prompt, ai_message_prompt, reminder_prompt]
    )

    # get a completed chat using formatted template with topic and title
    final_prompt = chat_prompt2.format_prompt(
        topic=quant_topic, title=quant_title
    ).to_messages()

    # pass template through llm and extract content attribute
    short_response = llm(final_prompt).content

    return first_response, short_response

## Run generate response

In [14]:
first_response, short_response = generate_response(
    llm, 
    quant_topic='Time Value of Money', 
    quant_title='Unveiling the Magic of Compounding: Time Value of Money'
    )

In [35]:
# helper lambda functions to quickly count dict values
count_length = lambda d: sum(len(d[val]) for val in d)
count_words = lambda d: sum(len(re.findall(r'\w+', d[val])) for val in d)

# Extract OpenAI response from given list of key words from template in system prompt Template
key_list=["Hook", "Intro", "Explanation", "Application", "Closing", "Action", "Engagement"]
def extract_tweet(openai_tweet: str, key_list: list) -> dict:
    """Creates dictionary from Openai response using keyword template

    Parameters:
        - openai_tweet: raw OpenAI response text that follows the response template
        - key_list: list of keywords used for searching the response template

    Returns:
        - dictionary: templated tweet
    """
    template = {}
    # Iterate through key list
    for i, key in enumerate(key_list):
        # find starting position
        start = openai_tweet.find(key_list[i])+len(key_list[i])+2
        if i != len(key_list) - 1:
            # using ending position, subset str and append to template
            end = openai_tweet.find(key_list[i+1])
            line = openai_tweet[start:end]
            template[key_list[i]] = line
        else:
            # if final word in list, only subsection by start word
            template[key_list[i]] = openai_tweet[start:]
    return template
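
The slicing logic can be tried without calling the API. The sketch below is a condensed, illustrative restatement of the same find-and-slice idea; `split_sections` and the sample text are hypothetical and not part of the notebook.

```python
def split_sections(text: str, keys: list[str]) -> dict[str, str]:
    """Slice `text` between consecutive keywords (condensed, illustrative version)."""
    sections = {}
    for i, key in enumerate(keys):
        # skip past the keyword plus the ': ' that follows it
        start = text.find(key) + len(key) + 2
        # the section ends where the next keyword starts, or at the end of the text
        end = text.find(keys[i + 1]) if i + 1 < len(keys) else len(text)
        sections[key] = text[start:end]
    return sections


# Hypothetical model output, shortened to three sections.
sample = (
    "Hook: Money has a clock.\n\n"
    "Intro: A dollar today beats a dollar tomorrow.\n\n"
    "Closing: Time is the real asset."
)
parsed = split_sections(sample, ["Hook", "Intro", "Closing"])
print(parsed)
```

Because `str.find` returns the first match, a keyword that also appears inside an earlier section's text would shift the slice. The trailing newlines are kept, so a `.strip()` on each value may be useful before posting.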

## Long Response

In [36]:
extract_tweet(first_response, key_list)

{'Hook': 'Unveiling the Magic of Compounding: Time Value of Money - Unlocking the Secrets to Financial Growth!\n\n',
 'Intro': 'The Time Value of Money is the financial concept that reveals the true power of compounding, allowing your money to work for you over time.\n\n',
 'Explanation': 'At its core, the Time Value of Money recognizes that a dollar today is worth more than a dollar in the future due to the potential to earn interest or returns. By understanding this concept, you can make informed decisions about investing, borrowing, and saving.\n\n',
 'Application': 'This concept is the backbone of many financial calculations, such as calculating the present value of future cash flows, determining loan payments, and evaluating investment opportunities. It helps investors assess the profitability of their investments and guides individuals in making smart financial choices.\n\n',
 'Closing': 'Embracing the Time Value of Money is like having a superpower in the world of finance. It em

## Short Response

In [37]:
extract_tweet(short_response, key_list)

{'Hook': 'Unveiling the Magic of Compounding: Time Value of Money - Financial Growth Unlocked!\n\n',
 'Intro': "Time Value of Money: Your money's potential to grow over time.\n\n",
 'Explanation': 'A dollar today is worth more than a dollar in the future due to compounding.\n\n',
 'Application': 'Guides financial decisions, like investments, loans, and savings, for long-term wealth accumulation.\n\n',
 'Closing': 'Time Value of Money is your superpower in finance, empowering strategic decision-making.\n\n',
 'Action': 'Invest early and consistently to harness the power of compounding.\n\n',
 'Engagement': 'How can you creatively apply Time Value of Money in your personal finance journey? #TimeValueOfMoney #Compounding'}
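
A quick way to judge whether a draft like this fits in a tweet is to total its characters and words, as the helper lambdas defined earlier do. The sketch below uses a shortened, hypothetical dictionary rather than real model output, and assumes the 280-character threshold that `Tweet.to_text` checks later in this notebook.

```python
import re

# Hypothetical, shortened tweet dictionary (not real model output).
draft = {
    "Hook": "Money has a clock.",
    "Closing": "Time is the real asset.",
}

# Same idea as count_length / count_words: sum over all dictionary values.
total_chars = sum(len(text) for text in draft.values())
total_words = sum(len(re.findall(r"\w+", text)) for text in draft.values())

TWEET_LIMIT = 280  # assumed threshold, matching the check in Tweet.to_text
print(total_chars, total_words, total_chars <= TWEET_LIMIT)
```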

## Preprocessing content-ideas from text file

Take the text file in table structure from GPT and split each line on its dividers (|).
Then create a dictionary and store it in the desired format back to a text file.

An example of the table structure is below.

| Number | Topic                          | Title                                                         |
| --- | --- | --- |
| 1      | Time Value of Money            | "Unveiling the Magic of Compounding: Time Value of Money"    |
| 2      | Risk and Return                | "Playing the Odds: Understanding Risk and Return"            |
| 3      | Modern Portfolio Theory        | "Crafting the Perfect Portfolio: An Intro to MPT"            |

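Splitting a single table row shows why the parsing reads from index 1 rather than index 0. This is a small sketch using the first row of the table above as the input string.

```python
# One row of the table above, as it would appear in the raw text file.
line = '| 1      | Time Value of Money            | "Unveiling the Magic of Compounding: Time Value of Money"    |\n'

items = line.split("|")
# The leading '|' yields an empty string at index 0, so the data starts at index 1.
number = int(items[1])  # int() tolerates surrounding whitespace
topic = items[2].strip()
title = items[3].strip().strip('"')  # drop padding, then the surrounding quotes

print(len(items), number, topic, title)
```

The final element of `items` is just the newline that follows the closing '|'.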

In [5]:
quant_tweets = {}

# the context manager closes the file even if parsing fails
with open(RAW_TEXT_FILE, "r") as file:
    for line_no, line in enumerate(file):
        # skip the header row and the separator row
        if line_no > 1:
            # split on line divisors
            items = line.split('|')
            # capture and ensure int and str formatting; ids are offset by 100
            tweet_no = int(items[1])+100
            quant_topic = items[2].strip()
            quant_title = items[3].strip().strip('"')
            # store within dict
            quant_tweets[tweet_no] = {'topic': quant_topic, 'title': quant_title}
            # print tweet no, topic and title
            print(f"{tweet_no}_{quant_topic}_{quant_title}")

# storing in desired format
# could directly place processed data location here: TEXT_FILE instead of 'quants_tweets.txt'
with open('quants_tweets.txt', 'w') as f:
    for tweet_no, tweet_info in quant_tweets.items():
        #Attaching 3 flags at the end which are currently all False
        tweet_repr = str(tweet_no)+'|'+tweet_info['topic']+'|'+tweet_info['title']+'|FALSE|FALSE|FALSE|\n'
        f.write(tweet_repr)

101_Time Value of Money_Understanding the Time Value of Money in Quant Finance
102_Risk and Return_Quantifying Risk and Return
103_Modern Portfolio Theory_Mastering Modern Portfolio Theory
104_Black-Scholes Model_Black-Scholes Model: The ABCs of Option Pricing
105_Multifactor Models_Exploring Multifactor Models in Finance
106_Copula Models_Copula Models: The Backbone of Financial Engineering
107_Stochastic Calculus_Stochastic Calculus for Finance: A Primer
108_Ito’s Lemma_Ito's Lemma Explained
109_Quantitative Risk Management_Techniques in Quantitative Risk Management
110_Simple Moving Average_Building a Simple Moving Average Trading Strategy
111_Backtesting_Introduction to Backtesting and its Pitfalls
112_Transaction Costs_Incorporating Transaction Costs in Trading Models
113_Slippage_Understanding and Accounting for Slippage
114_Portfolio Optimization_Portfolio Optimization Techniques
115_QuantLib_Introduction to QuantLib
116_Zipline_Getting Started with Zipline
117_Backtrader_Algori

## Twitter File

We will use dataclasses to:
1. store Tweets (`Tweet`),
2. track their status (`TrackTweet`), and
3. manage a `TweetQueue`
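
Dataclasses pair well with a text-file store because `asdict` and `json` give a one-line round trip. The sketch below uses a hypothetical `Note` class as a stand-in; it is not one of the classes defined in this notebook.

```python
from dataclasses import dataclass, asdict
from json import dumps, loads


@dataclass
class Note:  # hypothetical stand-in for a tweet-like record
    hook: str
    closing: str


note = Note(hook="Money has a clock.", closing="Time is the real asset.")

# dataclass -> dict -> JSON string (a single line, suitable for one text-file field)
as_json = dumps(asdict(note))

# JSON string -> dict -> dataclass
restored = Note(**loads(as_json))

print(as_json)
print(restored == note)
```

Dataclasses generate `__eq__` by default, so the restored object compares equal to the original when all fields match.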

In [44]:
from enum import Flag, auto
from json import dumps, loads
from dataclasses import dataclass, asdict, field


class Boolean(Flag):
    TRUE = True
    FALSE = False


class TweetType(Flag):
    SINGLE = auto()
    THREAD = auto()


@dataclass
class Tweet:
    Hook: str
    Intro: str
    Explanation: str
    Application: str
    Closing: str
    Action: str
    Engagement: str

    @classmethod
    def from_dict(cls, tweet_d: dict):
        # return class
        return cls(
            Hook=tweet_d["Hook"],
            Intro=tweet_d["Intro"],
            Explanation=tweet_d["Explanation"],
            Application=tweet_d["Application"],
            Closing=tweet_d["Closing"],
            Action=tweet_d["Action"],
            Engagement=tweet_d["Engagement"],
        )

    def to_text(self):
        _spaced_response = f"{self.Hook}\n{self.Intro}\n{self.Explanation}\n{self.Application}\n{self.Closing}\n{self.Action}\n{self.Engagement}"
        if len(_spaced_response) > 280:
            return f"{self.Hook}{self.Intro}{self.Explanation}{self.Application}{self.Closing}{self.Action}{self.Engagement}"
        else:
            return _spaced_response


@dataclass
class TrackTweet:
    """Class for keeping track of Tweets"""

    id: int
    topic: str
    title: str
    sent_status: Boolean = Boolean.FALSE
    gen_status: Boolean = Boolean.FALSE
    tweet: Tweet = field(init=False, repr=False)

    def __lt__(self, other):
        return (self.sent_status.value, self.id) < (other.sent_status.value, other.id)

    @classmethod
    def from_str(cls, tweet_line: str):
        # underscores used to indicate unpacked variables, only used internally
        (
            _id,
            _topic,
            _title,
            _sent_status,
            _gen_status,
            _tweet,
            _next_line,
        ) = tweet_line.split("|")
        # convert status TRUE/FALSE to Enum Representation
        _sent_status_bool = (
            Boolean.TRUE if _sent_status == Boolean.TRUE.name else Boolean.FALSE
        )
        # confirm if tweet already written or not, if so load previously written tweet
        _gen_status_bool = (
            Boolean.TRUE if _gen_status == Boolean.TRUE.name else Boolean.FALSE
        )
        # init class without tweet
        _trackTweet = cls(
            id=int(_id),
            topic=_topic,
            title=_title,
            sent_status=_sent_status_bool,
            gen_status=_gen_status_bool,
        )

        if _gen_status_bool:
            # return class with written tweet
            _trackTweet.tweet = Tweet.from_dict(loads(_tweet))

        return _trackTweet

    def to_str(self):
        _part_1 = f"{self.id}|{self.topic}|{self.title}|{self.sent_status.name}|{self.gen_status.name}|"
        _part_2 = (
            f"{dumps(asdict(self.tweet)) if hasattr(self, 'tweet') else 'FALSE'}|\n"
        )
        return _part_1 + _part_2

    def update_status(self, new_status: Boolean):
        self.sent_status = new_status


@dataclass
class TweetQueue:
    tweets: list[TrackTweet] = field(default_factory=list)

    def __len__(self):
        return len(self.tweets)

    def __iter__(self):
        yield from self.tweets

    @property
    def tweets_not_sent(self):
        return [tweet for tweet in self.tweets if not tweet.sent_status]

    @property
    def tweets_not_generated(self):
        return [tweet for tweet in self.tweets if not tweet.gen_status]

    @property
    def tweets_ready_for_sending(self):
        return [
            tweet for tweet in self.tweets if tweet.gen_status and not tweet.sent_status
        ]

    def enqueue(self, tweet):
        # print(f"{tweet.to_str()} will be added.")
        self.tweets.append(tweet)

    def dequeue(self):
        # print(f"{self.tweets[0].to_str()} will be removed.")
        # lists have no popleft(); pop(0) removes and returns the first element
        return self.tweets.pop(0)

    @classmethod
    def from_text_file(cls, text_file):
        _tweets = cls()
        # use a context manager so the file handle is closed after reading
        with open(text_file, "r") as f:
            for tweet_line in f:
                tweet = TrackTweet.from_str(tweet_line)
                _tweets.enqueue(tweet)
        return _tweets

    def to_text_file(self, text_file):
        with open(text_file, "w") as f:
            for tweet in self.tweets:
                tweet_line = tweet.to_str()
                f.write(tweet_line)
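
A note on the queue mechanics: a plain Python list removes its first element with `pop(0)`, which is O(n), while `collections.deque` offers an O(1) `popleft()`. The sketch below is a hypothetical, simplified variant showing how an `__lt__` ordering like the one in `TrackTweet` could be combined with a deque. The `Item` class is an assumption for illustration and not part of the notebook.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Item:  # hypothetical, simplified stand-in for TrackTweet
    id: int
    sent: bool = False

    def __lt__(self, other: "Item") -> bool:
        # unsent (False) sorts before sent (True); ties are broken by id
        return (self.sent, self.id) < (other.sent, other.id)


items = [Item(3), Item(1, sent=True), Item(2)]

# sorted() only needs __lt__, so unsent items come first, ordered by id
queue = deque(sorted(items))

first = queue.popleft()  # O(1) removal from the front of the deque
print(first, len(queue))
```

For a queue of a few hundred tweets the difference is negligible, so a list is a reasonable choice as well.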


## Use Tweet Classes

Let's read in the tweet queue from the processed data file.

We will use the iterator we constructed to loop directly over the queue's `tweets` attribute.

In [48]:
tweetQueue = TweetQueue.from_text_file(TEXT_FILE)
for tweet in tweetQueue:
    print(tweet)
    break

TrackTweet(id=1, topic='Time Value of Money', title='Unveiling the Magic of Compounding: Time Value of Money', sent_status=<Boolean.TRUE: True>, gen_status=<Boolean.TRUE: True>)
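
The loop above works because the class defines `__iter__`. As a minimal sketch of that pattern, the hypothetical `Box` container below mirrors the shape of `TweetQueue` with plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Box:  # hypothetical container, mirroring the shape of TweetQueue
    items: list[str] = field(default_factory=list)

    def __len__(self) -> int:
        return len(self.items)

    def __iter__(self):
        # delegate iteration to the underlying list
        yield from self.items


box = Box(["a", "b", "c"])
print(len(box), list(box), next(iter(box)))
```

Using `field(default_factory=list)` gives every instance its own list instead of a shared mutable default.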


## Generate Tweets from Data

Putting it all together, let's use the `tweets_not_generated` property of our TweetQueue to get the next tweet that has not been generated yet.

Let's pass the topic and title to our `generate_response` function and extract the tweet into a dictionary.

In [50]:
quant_tweet_idea = tweetQueue.tweets_not_generated[0]
first_response, short_response = generate_response(llm, quant_topic=quant_tweet_idea.topic, quant_title=quant_tweet_idea.title)
first_draft = extract_tweet(first_response, key_list)
first_draft

{'Hook': '"Order Book Dynamics: Where the battle of buyers and sellers unfolds, revealing the true market sentiment."\n\n',
 'Intro': 'Order Book Dynamics refers to the continuous interplay between buy and sell orders in a financial market, providing valuable insights into market depth and liquidity.\n\n',
 'Explanation': 'The order book is a real-time record of all pending buy and sell orders for a particular asset. It displays the quantity and price at which market participants are willing to buy or sell. As orders are executed, the order book dynamically adjusts, reflecting changes in supply and demand.\n\n',
 'Application': 'Understanding order book dynamics is crucial for market participants, including traders, market makers, and algorithmic trading systems. By analyzing the order book, traders can identify support and resistance levels, gauge market sentiment, and make informed trading decisions. Market makers utilize order book dynamics to provide liquidity and manage their inve

We can then pass the dictionary to `Tweet.from_dict` to create a `Tweet` object.

In [52]:
Tweet.from_dict(first_draft)

Tweet(Hook='"Order Book Dynamics: Where the battle of buyers and sellers unfolds, revealing the true market sentiment."\n\n', Intro='Order Book Dynamics refers to the continuous interplay between buy and sell orders in a financial market, providing valuable insights into market depth and liquidity.\n\n', Explanation='The order book is a real-time record of all pending buy and sell orders for a particular asset. It displays the quantity and price at which market participants are willing to buy or sell. As orders are executed, the order book dynamically adjusts, reflecting changes in supply and demand.\n\n', Application='Understanding order book dynamics is crucial for market participants, including traders, market makers, and algorithmic trading systems. By analyzing the order book, traders can identify support and resistance levels, gauge market sentiment, and make informed trading decisions. Market makers utilize order book dynamics to provide liquidity and manage their inventory. Alg