Max Orenstein

4/25/2024

# Towards a Compromise: Balancing the Dangers and Capabilities of Automated Email Marketing Using LLMs

### Introduction

Today's consumers crave personalization. According to McKinsey, "Seventy-one percent of consumers expect companies to deliver personalized interactions. And seventy-six percent get frustrated when this doesn’t happen." <cite id="xv02a"><a href="#zotero%7C15786750%2F9K5HAI3K">(Arora et al., n.d.)</a></cite> The era of sending one email to a long list of clients is long over. Each customer is unique, not only across demographic, behavioral, and psychographic categories, but also as a person. Furthermore, there are levels of personalization. Think of it as a spectrum with "one message to all" on the far left, "segmentation" in the middle, and "1:1 communication" on the far right. For marketers managing vast amounts of customer data, true 1:1 personalized messaging has long been a pipe dream. Until recently, marketers have relied on two main statistical techniques to craft personalized messages for their customer bases: segmentation and A/B testing. Cluster analysis groups similar customers together based on data the company has collected, and then an experiment is run on test subjects to determine the most effective message for each segment. While these methods are proven, they have several limitations. Most importantly, they are time-consuming and resource-intensive. With just eight properties at two levels each, there are 2^8 = 256 segments on which A/B testing must be conducted! That's why machine learning and AI are poised to heavily disrupt the field of marketing. 
In a 2020 Deloitte study, 74% of global AI executives agreed that “AI will be integrated into all enterprise applications within three years.” <cite id="pkvu2"><a href="#zotero%7C15786750%2F6QS5RKIC">(<i>Becoming an AI-Fueled Organization, State of AI in the Enterprise, 4th Edition</i>, n.d.)</a></cite> Imagine a machine learning algorithm that predicts a new customer's churn risk, customer lifetime value (CLV), best incentive, and upsell propensity, then feeds those predictions, along with demographic information, into an LLM that crafts a carefully worded, persuasive marketing email. That is the true potential of AI in marketing. Indeed, marketers who have traditionally relied on statistical methods like cluster analysis and A/B testing are seeing the potential use cases for AI in this context. However, integrating this technology has proved difficult: LLMs often behave strangely, hallucinating and drifting away from the brand's image. Thus, marketers looking to leverage the transformative power of LLMs must take an integrated, holistic approach that ensures safe and effective use of this technology. This essay discusses the challenges marketers face in implementing these methods and what they can do to get the gains without the pains.
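
The segment count mentioned above can be checked with a few lines of Python. This is only an illustrative sketch: the eight property names are hypothetical placeholders, not attributes taken from any real data set.

```python
from itertools import product

# Hypothetical binary customer properties (illustrative names only).
PROPERTIES = [
    "new_customer", "opened_last_email", "has_purchased", "mobile_user",
    "loyalty_member", "used_coupon", "subscribed_sms", "abandoned_cart",
]
LEVELS = (False, True)  # two levels per property


def enumerate_segments(properties, levels):
    """Return every combination of property levels as a dict (one per segment)."""
    return [
        dict(zip(properties, combo))
        for combo in product(levels, repeat=len(properties))
    ]


segments = enumerate_segments(PROPERTIES, LEVELS)
print(len(segments))  # 2 ** 8 = 256
```

Adding a ninth two-level property would double the count to 512, which is why exhaustive A/B testing across segments scales so poorly.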

!["Personalization drives performance and better customer outcomes. Companies that grow faster drive 40 percent more of their revenue from personalization than their slower-growing counterparts."<cite id="xv02a"><a href="#zotero%7C15786750%2F9K5HAI3K"> (Arora et al., n.d.)</a></cite>](attachment:70957246-e078-46ee-96d0-3fad32f9dcaa.png)

### Background

To understand the benefits and limitations of AI in marketing, it's important to ground some terminology. AI is an umbrella term for a field of computer science research concerned with automating intelligent behavior. AI has many subcategories, but the ones I'm most concerned with here are machine learning (ML) algorithms and large language models (LLMs). Machine learning uses computer programs that process huge amounts of data and learn to recognize patterns via an algorithm. Marketers use machine learning to monitor consumer behavior, developing algorithms that track websites visited, emails opened, downloads, clicks, and so on. They can also analyze how the user behaves across channels: which accounts they follow, posts they like, ads they interact with, etc. <cite id="gv12u"><a href="#zotero%7C15786750%2F2CQPP3Y3">(Ribeiro &#38; Reis, 2020)</a></cite> LLMs like the ones behind OpenAI's ChatGPT use neural networks and deep learning algorithms to uncover patterns in language from huge amounts of unstructured data, and they generate coherent text by repeatedly predicting the most likely next word from a probability distribution over language. Marketers can use machine learning methods to extract insights from structured or unstructured data, which can then be applied to content generation. This allows content to change robustly as the consumer changes. The traditional process of content creation, online experimentation, and feedback collection, while reliable, doesn't provide insights for future content improvement. <cite id="vixaj"><a href="#zotero%7C15786750%2FSWBWUGRN">(Ma &#38; Sun, 2020)</a></cite> Machine learning differs from statistics in that statistics is about finding relationships between variables and the significance of those relationships, while machine learning is more concerned with predictive power. 
Likewise, machine learning models provide various degrees of interpretability, but they generally sacrifice interpretability for predictive power. <cite id="rjfm1"><a href="#zotero%7C15786750%2F8UHIQ6H6">(Stewart, 2020)</a></cite> Thus machine learning should be preferred for tasks where predictive accuracy is more important while statistical methods should be deployed when interpretability is more important.

![](attachment:cb3aeee4-ad31-41c8-8a61-3a487d8ae409.png)

The AI revolution in marketing isn't coming; it's already here. ML algorithms are already deployed to help organizations predict and enhance key metrics and make strategic corrections using structured and unstructured data. One example is propensity models, which are used to predict consumer behavior. These models forecast the likelihood of a specific customer purchasing behavior, such as whether a visitor to a website will make a purchase. <cite id="mlg9q"><a href="#zotero%7C15786750%2FYNG3A3Q2">(Gupta &#38; Joshi, 2022)</a></cite> At Salesforce, the Sales Cloud Einstein suite has several capabilities, including an AI-based lead-scoring system that automatically ranks B2B customer leads by the likelihood of purchase. <cite id="e9vcj"><a href="#zotero%7C15786750%2FDRTWS9PK">(Davenport et al., 2021)</a></cite> Replacing cluster analysis is automated segmentation, the practice of grouping consumers into cohorts that meet a set of criteria, using machine learning to offer suggestions throughout the system in addition to demographic and transactional data. North Face, using IBM Watson, applies AI to determine which jackets consumers may be interested in based on available data. The system begins by asking where, when, and for what activities the consumer will wear the jacket and, based on the weather forecast for that location and the wearer’s gender, narrows the search to six options. Based on the activity, it then ranks the alternatives from “high match” to “low match”. This saves the wearer time by sparing them hundreds of jacket options, many of which would not even meet their functional needs, and it improves the quality of the customer experience throughout the decision-making journey. 
<cite id="gv12u"><a href="#zotero%7C15786750%2F2CQPP3Y3">(Ribeiro &#38; Reis, 2020)</a></cite> Helpful AI agents powered by machine learning algorithms are guiding customers through their purchase journeys via search results, website customization, and chatbots. <cite id="vixaj"><a href="#zotero%7C15786750%2FSWBWUGRN">(Ma &#38; Sun, 2020)</a></cite> AI has even allowed companies to collect new types of data. One example is Procter & Gamble’s Olay Skin Advisor, which uses deep learning to analyze selfies that customers have taken, assess their age and skin type, and recommend appropriate products. It is integrated into an e-commerce and loyalty platform, Olay.com, and has improved conversion rates, bounce rates, and average basket sizes in some geographies. <cite id="e9vcj"><a href="#zotero%7C15786750%2FDRTWS9PK">(Davenport et al., 2021)</a></cite> What's new here are LLMs. The ability of LLMs to personalize messages to individual customers using predictions from ML has the potential to transform all these insights into action.
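
To make the idea of propensity scoring and lead ranking more concrete, here is a minimal sketch of a logistic scoring function. It is not how Sales Cloud Einstein or any vendor system works; the feature names, weights, and bias are hypothetical values chosen for illustration, whereas a real model would learn them from historical data.

```python
import math

# Hypothetical coefficients; a real model would estimate these from data.
WEIGHTS = {"visits_last_30d": 0.15, "emails_opened": 0.30, "cart_adds": 0.80}
BIAS = -3.0


def purchase_propensity(features):
    """Logistic score in (0, 1): estimated likelihood that the lead purchases."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def rank_leads(leads):
    """Sort leads from most to least likely to purchase."""
    return sorted(
        leads,
        key=lambda lead: purchase_propensity(lead["features"]),
        reverse=True,
    )


leads = [
    {"id": "A", "features": {"visits_last_30d": 2, "emails_opened": 1, "cart_adds": 0}},
    {"id": "B", "features": {"visits_last_30d": 10, "emails_opened": 4, "cart_adds": 2}},
    {"id": "C", "features": {"visits_last_30d": 5, "emails_opened": 0, "cart_adds": 1}},
]

for lead in rank_leads(leads):
    print(lead["id"], round(purchase_propensity(lead["features"]), 3))
```

Scores like these are the kind of structured prediction that could later be passed to an LLM as context for a personalized message.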

![](attachment:0749417c-84d0-4399-801e-7a2914c7aa46.png)

Marketers stand to be among the largest beneficiaries of the AI revolution. A 2020 Deloitte global survey of early AI adopters showed that three of the top five AI objectives were marketing-oriented: enhancing existing products and services, creating new products and services, and enhancing relationships with customers. <cite id="pkvu2"><a href="#zotero%7C15786750%2F6QS5RKIC">(<i>Becoming an AI-Fueled Organization, State of AI in the Enterprise, 4th Edition</i>, n.d.)</a></cite> As we've seen, AI has already been adopted in various forms across all sorts of industries, and as new methodologies emerge, old techniques will be disrupted. The greatest potential for AI in marketing lies in use cases where more established techniques can already be used but where ML methods can be more effective. According to the 2018 McKinsey report *Notes from the AI Frontier Insights from Hundreds of Use Cases*, 69 percent of the AI use cases identified in the study were improvements on old techniques, while only 16 percent were “greenfield” AI solutions applicable where other analytics methods would not be effective. <cite id="dbtrl"><a href="#zotero%7C15786750%2FCF9AH7B2">(Chiu et al., 2018)</a></cite> Digital marketing specifically is well positioned to adapt to AI. E-commerce platforms sit on vast amounts of customer data gathered from various parts of the web. Click data and time spent on web pages can be measured very precisely, and algorithms can then be designed to customize promotions, pricing, and products for each customer dynamically and in real time. <cite id="dbtrl"><a href="#zotero%7C15786750%2FCF9AH7B2">(Chiu et al., 2018)</a></cite> Already in email marketing and web design, LLMs can use this information to tailor their output and deliver the hyper-personalization that customers crave.
    
![We believe that marketers will ultimately see the greatest value by pursuing integrated machine-learning applications, though simple rule-based and task-automation systems can enhance highly structured processes and offer reasonable potential for commercial returns. ](attachment:ec0ce87a-b220-472b-8cd0-496964219d52.PNG)

### Project

In this project I integrated Klaviyo (an email marketing automation platform) with a Shopify development store populated with randomly generated customers. I then retrieved the customer data using the Klaviyo API and used LangChain (a framework for developing LLM applications) to generate emails based on that data. Using this approach, I was able to generate hyper-personalized messages for each customer in a couple of minutes. This methodology has benefits and drawbacks, and it is by no means the only approach to personalization using LLMs. The most obvious benefit is the ease and time saved. Writing fourteen highly personalized emails for each customer by hand would have taken several hours of not very stimulating work. AI can reduce the cost of repetitive tasks and redirect marketers to tasks that are more about creativity, strategy, and decision making. <cite id="gv12u"><a href="#zotero%7C15786750%2F2CQPP3Y3">(Ribeiro &#38; Reis, 2020)</a></cite> There's also the financial component. While API calls can get expensive, for most marketers AI does not change the level of marketing spend. <cite id="gv12u"><a href="#zotero%7C15786750%2F2CQPP3Y3">(Ribeiro &#38; Reis, 2020)</a></cite> This means chatbots can deliver quicker, more personalized messages at a cost similar to current methods. But these systems do not come without challenges. 
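
The general shape of such a pipeline can be sketched as follows. This is not the project's actual code and it does not call Klaviyo or LangChain: the customer fields, the prompt wording, the brand description, and the `fake_llm` stand-in are all hypothetical, and a real implementation would replace `fake_llm` with a call to an actual model client.

```python
from typing import Callable

# Hypothetical customer record; field names are illustrative, not Klaviyo's schema.
customer = {
    "first_name": "Dana",
    "city": "Denver",
    "last_product": "trail running shoes",
    "churn_risk": "high",
}

# Hypothetical prompt; the brand and constraints are placeholders.
PROMPT_TEMPLATE = (
    "You are writing a marketing email for an outdoor apparel brand.\n"
    "Customer first name: {first_name}\n"
    "Customer city: {city}\n"
    "Most recent purchase: {last_product}\n"
    "Predicted churn risk: {churn_risk}\n"
    "Use only the facts listed above. Do not invent discounts or products.\n"
    "Write a friendly email of under 120 words."
)


def build_prompt(profile: dict) -> str:
    """Fill the template with one customer's data."""
    return PROMPT_TEMPLATE.format(**profile)


def generate_email(profile: dict, llm: Callable[[str], str]) -> str:
    """Pass the filled prompt to any callable that maps a prompt to text."""
    return llm(build_prompt(profile))


def fake_llm(prompt: str) -> str:
    """Placeholder so the sketch runs offline; swap in a real model client."""
    return f"[model output for a prompt of {len(prompt)} characters]"


print(build_prompt(customer))
print(generate_email(customer, fake_llm))
```

Keeping the model behind a plain callable makes it easy to loop over every customer record and to swap providers without touching the prompt logic.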

![The emails generated in my project](attachment:7ad3fc0c-9a9c-424f-b74c-2d8b36cc4f8f.png)

Hallucinations are an issue. The output of an LLM can range from misleading to patently false. The core problem is that LLMs are probabilistic predictors, not fact-based systems. LLMs will output anything that fits the context, regardless of the underlying truth. <cite id="69189"><a href="#zotero%7C15786750%2FDQWABHFD">(<i>The Definitive Guide to Large Language Models and High-Performance Marketing Content</i>, n.d.)</a></cite> It's easy to see how this can become problematic in a marketing context. A brand's messaging must be carefully constructed to fit its particular image, stay factual, and not come off as weird or creepy. According to respondents in Ribeiro and Reis's study, marketers in recent years have wondered how marketing can deliver value without being too intrusive (externally) and how it can reshape and empower people within companies (internally) to work in this way. <cite id="gv12u"><a href="#zotero%7C15786750%2F2CQPP3Y3">(Ribeiro &#38; Reis, 2020)</a></cite> Moreover, these issues are not easy to fix. Because LLMs are probabilistic models, a hallucination is not a bug in the code that can simply be patched, as it would be in traditional software. Methods such as careful prompt engineering and fine-tuning can help if correctly executed, but they aren't a perfect solution. While fine-tuning can significantly improve classification task accuracy when specific domain expertise is required, the process is not as straightforward or effective as one might hope for text-generation applications. As with all machine learning applications, the fine-tuned model’s performance will decrease if the training data is not properly structured, comprehensive, and of high quality. <cite id="69189"><a href="#zotero%7C15786750%2FDQWABHFD">(<i>The Definitive Guide to Large Language Models and High-Performance Marketing Content</i>, n.d.)</a></cite> I observed all of these issues throughout the project. 
The difficulty is striking the balance between telling the LLM to be creative and appeal to the interests of the specific customer while building in some sort of safety net to ensure it's not hallucinating.

![We used GPT-3.5 Turbo, which, at the time of writing, is the best model from OpenAI that supports fine-tuning. The results showed that this is not a suitable approach to the problem. Out of 100 generations, more than 20 contained obvious factual error <cite id="69189"><a href="#zotero%7C15786750%2FDQWABHFD">(<i>The Definitive Guide to Large Language Models and High-Performance Marketing Content</i>, n.d.)</a></cite>](attachment:7d6c0916-6438-4cfb-a4bd-6eddf9159d24.PNG)

In his book *Co-Intelligence: Living and Working with AI*, Ethan Mollick presents four principles for using AI: (1) always bring AI to the table, (2) be the human in the loop, (3) treat AI like a person, and (4) assume this is the worst AI you'll ever use. <cite id="ymmye"><a href="#zotero%7C15786750%2FIF4CA458">(Mollick, 2024)</a></cite> Firms looking to integrate LLMs for automated marketing should start by following the first two principles. They should try to integrate AI wherever they can but monitor its performance closely and evaluate it in terms of opportunity cost against other techniques. The likely solution for most firms is some sort of feedback loop in which results receive human feedback and are passed back through the system in order to improve it. <cite id="69189"><a href="#zotero%7C15786750%2FDQWABHFD">(<i>The Definitive Guide to Large Language Models and High-Performance Marketing Content</i>, n.d.)</a></cite> When it comes to brand voice and hallucination, it is best to use NLP technology to post-process the generated content and catch problems. A rule-based solution will be more reliable and give you more control than attempting to force an LLM to output perfect content every time. <cite id="69189"><a href="#zotero%7C15786750%2FDQWABHFD">(<i>The Definitive Guide to Large Language Models and High-Performance Marketing Content</i>, n.d.)</a></cite> There are also neural-network-based systems that can score and extract insights from a marketing content design. One example is the system developed by Kong et al., in which a multimodal neural network predicts the attractiveness of marketing content and a post-hoc attribution method generates actionable insights that help marketers improve their content in specific marketing locations. The system is designed to close the loop between content creation and online experimentation. 
<cite id="wrus3"><a href="#zotero%7C15786750%2F7GLXHBRV">(Kong et al., 2023)</a></cite> Feeding a model this type of real-time experimentation data could reduce hallucinations. 
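
A rule-based safety net of the kind described above might look something like the following sketch. The approved discounts, banned phrases, and routing labels are hypothetical; a real deployment would load its rules from the brand's own catalog and style guide, and would likely need more nuanced checks than simple pattern matching.

```python
import re

# Hypothetical brand rules; real ones would come from a catalog or style guide.
ALLOWED_DISCOUNTS = {10, 15}
BANNED_PHRASES = ("guaranteed", "we know where you live", "act now or else")


def check_email(text, customer_name):
    """Return a list of rule violations; an empty list means the draft passes."""
    problems = []
    # Treat every percentage in the text as a discount claim and verify it.
    for match in re.finditer(r"(\d+)\s*%", text):
        if int(match.group(1)) not in ALLOWED_DISCOUNTS:
            problems.append(f"unapproved discount: {match.group(0)}")
    lowered = text.lower()
    for phrase in BANNED_PHRASES:
        if phrase in lowered:
            problems.append(f"banned phrase: {phrase}")
    if customer_name not in text:
        problems.append("customer name missing")
    return problems


def route(text, customer_name):
    """Send clean drafts onward and flag the rest for human review."""
    problems = check_email(text, customer_name)
    return ("send", problems) if not problems else ("human_review", problems)


draft = "Hi Dana, enjoy a guaranteed 40% off your next pair of shoes!"
print(route(draft, "Dana"))
```

Drafts routed to `human_review` are where the human in the loop comes in: a marketer's corrections can be logged and used to refine both the prompt and the rules over time.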

![](attachment:a7ad2164-f5e8-4a2a-abd8-6f9cf031957e.jpg)

### Conclusion

In summary, the rise of AI in marketing responds to growing consumer demand for personalization, offering the potential for true 1:1 messaging at scale. However, integrating AI presents challenges such as hallucinations and brand misalignment, along with the need to balance creativity with control. Amara's Law holds that we tend to overestimate the effect of a technology in the short run and underestimate it in the long run. It cautions marketers against expecting too much too soon, but it also underscores the importance of anticipating AI's profound long-term impact, heralding an era in which hyper-personalization becomes standard practice. By adhering to principles such as being the human in the loop and treating AI as a partner, marketers can mitigate these risks while harnessing AI's capabilities to drive innovation, enhance customer experiences and revenue, and stay ahead in an increasingly competitive landscape.

![](attachment:b9e5d6b7-11aa-436a-ac31-debfceb78b55.png)

### Bibliography

<!-- BIBLIOGRAPHY START -->
<div class="csl-bib-body">
</div>
<!-- BIBLIOGRAPHY END -->