# Interacting with Campaigns and Filters <a class="anchor" id="top"></a>

Now that Unicorn Post has trained models for two use cases (User Personalization and Personalized Reranking), we need to integrate them into our application. Amazon Personalize makes recommendations available through an Application Programming Interface (API). It also includes features that simplify integration with applications, such as real-time vending of recommendations based on recent application activity.

In this notebook, you will interact with campaigns and filters you created earlier in Amazon Personalize.

1. [Introduction](#intro)
1. [Interact with Recommenders](#interact-recommenders)
1. [Interact with Campaigns](#interact-campaigns)
1. [Filters](#filters)
1. [Create Filters](#create-filters)
1. [Using Filters](#using-filters)
1. [Real-time Events](#real-time)
1. [Wrap Up](#wrapup)

To run this notebook, you need to have run the previous notebooks, [`News_01_Data_Layer.ipynb`](News_01_Data_Layer.ipynb) and [`News_02_Training_Layer.ipynb`](News_02_Training_Layer.ipynb), where you created a dataset, imported interaction, item, and user data into Amazon Personalize, and created recommenders, solutions, and campaigns.

## Introduction <a class="anchor" id="intro"></a>
[Back to top](#top)

At this point, you should have two deployed campaigns. Once they are active, there are resources for querying the recommendations, and helper functions to digest the output into something more human-readable.

In this Notebook we will interact with the Campaigns and get recommendations. 

We will interact with filters and send live data to Amazon Personalize to see the effect of real-time interactions on recommendations.

The following diagram shows the resources that we will create in this section. The part we are building in this notebook is highlighted in blue with a dashed outline.

![Workflow](Images/03_Inference_Layer_Resources.jpg)

To get started, once again, we need to import libraries, load values from previous notebooks, and load the SDK.

In [6]:
import time
from time import sleep
import json
from datetime import datetime
import uuid
import random
import boto3
import pandas as pd

pd.set_option('max_colwidth', 3000) # allows us to see more text for our news articles

In [7]:
#retrieves previously stored variables 
%store -r 

In [8]:
personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')

# Establish a connection to Personalize's event streaming
personalize_events = boto3.client(service_name='personalize-events')

## Interact with Campaigns <a class="anchor" id="interact-recommenders"></a>
[Back to top](#top)

Now that the models have been trained, let's have a look at the recommendations we can get for our users!

### User Personalization Model

"User Personalization" requires a user as input and returns the items it thinks that user is most likely to interact with next.

The cells below will handle getting recommendations from the "User Personalization Model" and rendering the results. Let's see what the recommendations are for a user.

We will be using the `campaignArn`, the `userId`, and the number of results we want, `numResults`.

### Select a User

We'll just pick a random user for simplicity. Feel free to change the `user_id` below and execute the following cells with a different user to get a sense for how the recommendations change.

#### Sample User IDs
- `-8845298781299428018`
- `-1032019229384696495`
- `-1130272294246983140`
- `344280948527967603`
- `-445337111692715325`

In [9]:
sample_user = str(-8845298781299428018)

In [10]:
get_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = workshop_userpersonalization_campaign_arn,
    userId = sample_user,
    context={
        'user_device_type': 'UnknownAgent'
    },
    metadataColumns = {"ITEMS": ["training_text"]},
    numResults = 5
)

In [11]:
print(get_recommendations_response)

{'ResponseMetadata': {'RequestId': '0612c6fc-e37b-4b54-b7e7-41cee00319f8', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Tue, 23 Jul 2024 21:42:04 GMT', 'content-type': 'application/json', 'content-length': '6349', 'connection': 'keep-alive', 'x-amzn-requestid': '0612c6fc-e37b-4b54-b7e7-41cee00319f8'}, 'RetryAttempts': 0}, 'itemList': [{'itemId': '3566197569262766169', 'score': 0.9930378, 'metadata': {'training_text': 'Google extends Gmail API for more granular email settings management Google updated the Gmail API with new endpoints to manage filters, aliases, forwarding, signatures, vacation responders, and other granular email settings. This replaces the deprecated Email Settings API. Google has extended the Gmail API with new endpoints for managing email settings like filters, forwarding addresses, IMAP/POP settings, sendas aliases, signatures, and vacation responders. Developers can now retrieve and update signatures for sendas aliases, configure forwarding to external addresses

That is a little hard to read, so let's make a DataFrame.

In [12]:
recommendations_df = pd.DataFrame.from_records(get_recommendations_response['itemList'])

In [13]:
recommendations_df

Unnamed: 0,itemId,score,metadata
0,3566197569262766169,0.993038,"{'training_text': 'Google extends Gmail API for more granular email settings management Google updated the Gmail API with new endpoints to manage filters, aliases, forwarding, signatures, vacation responders, and other granular email settings. This replaces the deprecated Email Settings API. Google has extended the Gmail API with new endpoints for managing email settings like filters, forwarding addresses, IMAP/POP settings, sendas aliases, signatures, and vacation responders. Developers can now retrieve and update signatures for sendas aliases, configure forwarding to external addresses, configure sendas aliases through external providers, use HTML in vacation messages, and manipulate settings for gmail.com accounts. More settings features like mailbox delegate support will be added over time. Most settings endpoints work for any Google Apps or Gmail account, but sensitive operations like modifying aliases or forwarding are restricted to service accounts with domainwide authority. The existing Email Settings API in the Admin SDK is deprecated and will be turned down on July 7, 2017 in favor of the updated Gmail API. Google has provided a migration guide to help clients transition.'}"
1,2573252627510191315,0.000278,"{'training_text': 'Setting autoexpiry dates for Google Drive file sharing links A guide explains how to create temporary access links for Google Drive files that automatically expire after a chosen time period, using a thirdparty web app. The article explains how to set autoexpiry dates for shared Google Drive links. This allows you to share files that are only accessible for a limited time. While Google Drive allows setting expiration dates, this is only available for paid Google Workspace accounts. The article provides instructions for free Google account users to create temporary links that autoexpire after a chosen time period. The steps are: 1. Go to labnol.org/expire and authorize the web app to access your Google Drive. 2. Open the File Picker and select the file or folder to share. 3. Enter the email addresses of users to share with. Specify if they get viewer or editor access. 4. Set the time period after which access will be revoked. 5. Click \""Set Expiration\"" and the access will be automatically removed after the set date/time. The article also mentions the Google Drive Auditor addon to analyze permissions on shared files. And the autoexpiry app lists files set to expire.'}"
2,672199059798181601,0.00026,"{'training_text': 'Using Docker Hub build hooks to generate dynamic image labels Build hooks for Docker Hub automated builds allow running scripts to generate useful dynamic labels like build date and Git commit hash. This adds value missing from default automated builds. The article discusses using build hooks to generate dynamic labels for Docker images built with automated builds on Docker Hub. Automated builds are popular but don't allow running dynamic code during the build process. This makes it difficult to generate useful labels like org.labelschema.builddate and org.labelschema.vcsref. Build hooks allow running scripts that can build and tag the image. An example build hook is provided that generates these labels. The hook shows it's possible to have useful dynamic labels with automated builds. The article also provides statistics on usage of automated builds 26% of public Docker Hub images use them, accounting for 32% of pulls. Finally it demonstrates the labels on a sample image built this way on Docker Hub and MicroBadger.'}"
3,-4205346868684833897,0.000231,"{'training_text': 'Google announces new enterprise products Springboard and redesigned Sites Google launched Springboard, an AIpowered digital assistant, and a redesigned Google Sites with draganddrop editing to improve enterprise productivity. Google has announced two new enterprise products Springboard and a redesigned Google Sites. Springboard is a digital assistant that provides a unified search interface across Google services like Gmail, Drive, Docs, and Calendar. It uses AI to surface relevant information and recommendations to workers. The redesign of Google Sites now allows draganddrop editing and realtime collaboration. It also optimizes content for any screen size. These new features aim to help enterprise users be more efficient. The products are currently in early adopter programs for existing Google Apps customers. Google has hinted at more enterprise improvements to come for both services.'}"
4,5189433264884433830,0.000227,"{'training_text': 'Google Calendar introduces Goals feature to help users schedule personal activities Google Calendar's new Goals feature uses AI to analyze a user's schedule and automatically find time to schedule personal goals like working out more. It gets better at scheduling over time. Google Calendar is introducing a new feature called Goals to help users schedule time for personal activities. Users can add a goal like \""Work out more\"" and answer questions about frequency and preferred times. Calendar will then analyze the user's schedule to find windows to schedule the goal. Goals have the same privacy settings as the default calendar. Calendar can automatically reschedule goals if events conflict and users can defer goals to make time later. Calendar gets better at scheduling goals over time based on edits and deferrals. Users can set their first goal by downloading the Google Calendar app on Android or iPhone. Currently goals do not account for secondary or synced calendars. The feature is launching fully over 13 days on both Rapid and Scheduled release tracks. Google suggests change management and provides Help Center articles on Goals. The latest Calendar apps can be downloaded from Google Play and the App Store. The feature launches for all Google Apps editions unless noted.'}"
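
The `metadata` column above is a nested dictionary, which makes the table hard to scan. A minimal sketch of one way to flatten it follows. `flatten_recommendations` and `max_chars` are illustrative names, not part of the Amazon Personalize SDK, and the sample list is a hand-made stand-in that only mimics the shape of `itemList`.

```python
def flatten_recommendations(item_list, max_chars=80):
    """Lift each item's metadata fields to top-level keys and shorten long text.

    Hypothetical helper: it only assumes the response shape shown above
    (itemId, score, and an optional metadata dict).
    """
    rows = []
    for item in item_list:
        row = {"itemId": item["itemId"], "score": item.get("score")}
        for key, value in (item.get("metadata") or {}).items():
            text = str(value)
            row[key] = text if len(text) <= max_chars else text[:max_chars] + "..."
        rows.append(row)
    return rows


# Hand-made stand-in for get_recommendations_response['itemList'].
sample_item_list = [
    {"itemId": "3566197569262766169", "score": 0.993038,
     "metadata": {"training_text": "Google extends Gmail API for more granular email settings management"}},
    {"itemId": "2573252627510191315", "score": 0.000278,
     "metadata": {"training_text": "Setting autoexpiry dates for Google Drive file sharing links"}},
]

for row in flatten_recommendations(sample_item_list, max_chars=40):
    print(row)
```

In the notebook, passing `flatten_recommendations(get_recommendations_response['itemList'])` to `pd.DataFrame.from_records` should give one column per metadata field instead of a single dictionary column.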


What has this user viewed previously?

In [14]:
viewed_interactions = interaction_data[interaction_data['user_id'].astype(str) == sample_user].sort_values('timestamp', ascending=False)[0:11]

In [15]:
viewed_interactions

Unnamed: 0,timestamp,event_type,item_id,user_id,session_id,user_device_type
69083,1487360168,FOLLOW,3566197569262766169,-8845298781299428018,808768479044973017,NonMobile
69085,1487360167,COMMENT CREATED,3566197569262766169,-8845298781299428018,808768479044973017,NonMobile
69082,1487359849,VIEW,3566197569262766169,-8845298781299428018,808768479044973017,NonMobile
69722,1487069095,VIEW,-8900113512825364282,-8845298781299428018,8663979798581613597,NonMobile
68453,1485973216,VIEW,-532999578436827210,-8845298781299428018,-5430065457414428568,NonMobile
68450,1485973198,VIEW,-532999578436827210,-8845298781299428018,-5430065457414428568,NonMobile
66947,1485900025,FOLLOW,-532999578436827210,-8845298781299428018,-4872963759822924080,NonMobile
66946,1485900025,COMMENT CREATED,-532999578436827210,-8845298781299428018,-4872963759822924080,NonMobile
66945,1485899837,VIEW,-532999578436827210,-8845298781299428018,-4872963759822924080,NonMobile
66803,1484848565,VIEW,7419040071212162906,-8845298781299428018,-6140011183054037117,NonMobile
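
The `timestamp` column holds Unix epoch seconds, which are hard to read at a glance. As a small sketch (the helper name `to_utc_iso` is our own), the standard library can convert them to UTC date strings:

```python
from datetime import datetime, timezone


def to_utc_iso(epoch_seconds):
    """Convert Unix epoch seconds to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc).isoformat()


# Most recent interaction in the table above.
print(to_utc_iso(1487360168))  # 2017-02-17T19:36:08+00:00
```

For a whole column, `pd.to_datetime(viewed_interactions['timestamp'], unit='s', utc=True)` performs the same conversion in pandas.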


Let's take a closer look at the most recent articles this user interacted with.

In [24]:
most_recent_five_articles = viewed_interactions.item_id.unique().tolist()

In [25]:
most_recent_five_articles_metadata = articles_mlfeatures[articles_mlfeatures['item_id'].isin(most_recent_five_articles)]

In [26]:
most_recent_five_articles_metadata[['item_id', 'training_text']]

Unnamed: 0,item_id,training_text
80,-8900113512825364282,"Banks must prioritize digital experience and leverage data analytics to enhance customer journey The report reveals that most banks lack formal plans to improve customer experience, despite it being a top priority. They need to focus on digital channels, personalization through data, and seamless cross channel engagement. The \""Improving Customer Experience in Banking\"" report reveals that most financial institutions are unprepared for increasing consumer expectations driven by technological advancements. Despite customer experience being a top priority, many banks lack formal plans and focus more on internal benefits like cost reduction rather than enhancing the customer experience. Key findings include: Only 37% of organizations have a formal customer experience plan. Objectives center on internal gains like increased share of wallet and efficiency, not customer benefits. Digital channels are crucial for driving satisfaction, but institutions prioritize branches and products over technology. While investment in customer experience is increasing, most firms report only modest impacts so far. Major challenges involve data analytics, legacy systems, and obtaining a complete customer view. Measurement varies widely, often lacking revenue based metrics. To improve, banks must prioritize digital experience, leverage data and analytics for personalization, enable seamless cross channel engagement, and proactively guide customers throughout their journey. Delivering an exceptional customer experience requires a data driven, process oriented approach focused on continuously enhancing touchpoints that matter most to consumers."
1456,-532999578436827210,"IBM launches cloud based graph database service IBM Graph IBM Graph is a new cloud service that aims to make graph database technology more accessible, providing an interactive tutorial and reliable back end for production workloads. IBM has launched a new cloud based graph service called IBM Graph, built on open source Titan and Apache TinkerPop technologies. The service aims to make graph technology more accessible, providing an interactive tutorial playground to help developers understand the Gremlin query language and graph concepts. While simplifying the front end, IBM has also focused on making the back end reliable for production workloads. IBM Graph is suitable for both analytical and transactional workloads, with a focus on real time graph queries from other applications. It uses Cassandra for data persistence and ElasticSearch for indexing. The service is available on IBM's Bluemix cloud, with pricing based on storage and API call usage. IBM is committed to not forking the core open source technologies, allowing customers to potentially move workloads to their own servers if desired. Future plans include enhancing data modeling and ETL capabilities to further simplify graph database adoption."
2148,3566197569262766169,"Google extends Gmail API with new email settings endpoints Developers can now retrieve and update signatures, configure forwarding, manage send as aliases, and manipulate other email settings for Gmail and Google Apps accounts using the new Gmail API endpoints. Google is extending the Gmail API with new endpoints for managing email settings like filters, forwarding, IMAP/POP settings, send as aliases, signatures, and vacation responders. Developers can now retrieve and update signatures for send as aliases, configure forwarding to external addresses, configure send as aliases that send through external providers, use HTML in vacation messages, and manipulate settings for gmail.com accounts. While most settings endpoints work for any Google Apps or Gmail account, some sensitive operations require service accounts with domain wide authority. This update effectively replaces the older Email Settings API which will be deprecated on July 7, 2017. Google has provided a migration guide to help port existing integrations to the new Gmail API endpoints."
2808,7419040071212162906,"Capitalize on Google's \""current year\"" search trend for traffic and income Including the current year in titles and content can drive significant Google traffic and affiliate marketing income by targeting popular product searches. Last year, an article shared a tactic that could send websites thousands of visitors from Google by including the current year (e.g. \""2016\"") in titles and content. The tactic worked, sending one website over 33,000 visitors from Google for a single article. The core idea is that people frequently add the current year to their Google searches to get the latest results (e.g. \""best business books 2016\""). Google Trends data confirms this behavior across many topics. Several success stories validate the approach: A new site called 10beasts.com focused entirely on \""best products of 2016\"" lists. Within 4 months, it ranked #1 for hot product terms and made $4,687 from Amazon affiliates in one month. By December, it generated over $80,000 in profit. BestProducts.com, backed by Hearst Media's authority links, dominated Google for 2016 product queries and grew traffic from 2.5 million to 7.5 million visitors in 5 months. To capitalize on this for 2017, thorough keyword research is needed across tools like Google Trends, Keyword Planner, and by reverse engineering competitors' traffic. The most valuable keywords relate to purchasable products to monetize via affiliate marketing. With proper execution, significant Google traffic and income is achievable."
2997,8526042588044002101,"Cloud Native is an approach to leverage automation and architectures that manage complexity and enable velocity for teams, culture, and technology. The article introduces the concept of Cloud Native, which aims to bring benefits like efficient teams, reliable systems, better visibility, enhanced security, and efficient resource usage to companies beyond just tech giants. Cloud Native is an approach to structuring teams, culture, and technology to leverage automation and architectures that manage complexity and enable velocity. It's about scaling both the people and technology sides of the equation. Key benefits include more efficient and happier teams, reliable infrastructure and applications, better visibility and debuggability, enhanced security, and efficient resource usage. While initially proven at tech giants like Google, Netflix, and Facebook, Cloud Native techniques are now being adopted by smaller, agile companies too. However, there are still few examples outside of the technology early adopters. The authors, with their experience at Google, are excited to help bring Cloud Native benefits to the wider IT industry. Future articles will cover integrating with existing systems, DevOps, containers, orchestration, microservices, and security. The goal is to make companies, teams, and people more successful by applying these modern approaches."


We see that the user has previously read articles about cloud applications, and about Google in particular. In fact, one of our recommendations is already in the user's recent interaction history. This is not ideal, so let's use the filter we created earlier to exclude such items. We will set a timestamp cutoff of 0 for now to include all articles, and we will also include all available genres in the filter for the moment.
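
To check for this kind of overlap programmatically rather than by eye, compare the recommended item IDs with the IDs in the user's history. The sketch below uses IDs copied by hand from the tables above, and `already_seen` is an illustrative helper name. Amazon Personalize returns item IDs as strings, while a DataFrame column may hold integers, so both sides are cast to `str` before comparing.

```python
def already_seen(recommended_ids, history_ids):
    """Return recommended IDs that also appear in the history, keeping rank order."""
    seen = {str(item_id) for item_id in history_ids}
    return [str(item_id) for item_id in recommended_ids if str(item_id) in seen]


# IDs copied from the recommendation and interaction tables above.
recommended = ["3566197569262766169", "2573252627510191315", "672199059798181601",
               "-4205346868684833897", "5189433264884433830"]
history = [3566197569262766169, -8900113512825364282, -532999578436827210,
           7419040071212162906, 8526042588044002101]

print(already_seen(recommended, history))  # ['3566197569262766169']
```

With the notebook's variables, the same call would take `recommendations_df['itemId']` and `most_recent_five_articles` as its two arguments.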

In [33]:
get_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = workshop_userpersonalization_campaign_arn,
    userId = sample_user,
    numResults = 5,
    filterArn = genre_filter_arn,
    filterValues = {"CUTOFF": "\"0\"", "GENRELIST": "\"tech\",\"non tech\",\"cloud provider news\",\"crypto currency\""},
    metadataColumns = {"ITEMS": ["training_text"]}
)
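
The escaped quotes in `filterValues` are easy to get wrong by hand. In the cell above, every individual value is wrapped in double quotes and list entries are separated by commas. A minimal sketch of a helper that builds these strings follows. `quote_filter_values` is a hypothetical name, and the sketch assumes the values themselves contain no double quotes.

```python
def quote_filter_values(values):
    """Wrap each value in double quotes and join the results with commas.

    Accepts a single value or an iterable of values.
    """
    if isinstance(values, (str, int, float)):
        values = [values]
    return ",".join('"{}"'.format(value) for value in values)


filter_values = {
    "CUTOFF": quote_filter_values(0),
    "GENRELIST": quote_filter_values(
        ["tech", "non tech", "cloud provider news", "crypto currency"]
    ),
}
print(filter_values["CUTOFF"])     # "0"
print(filter_values["GENRELIST"])  # "tech","non tech","cloud provider news","crypto currency"
```

The resulting dictionary matches the literal passed as `filterValues` above, so it could be supplied in its place.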

In [34]:
genre_filter_arn

'arn:aws:personalize:us-east-1:908388459961:filter/immersion-day-news-genre-filter'

In [35]:
recommendations_df = pd.DataFrame.from_records(get_recommendations_response['itemList'])

In [36]:
recommendations_df

Unnamed: 0,itemId,score,metadata
0,3618205920179577598,0.121072,"{'training_text': 'Celcoin allows users to exchange bitcoin for cash credit to use for everyday transactions The crypto platform Celcoin lets users convert bitcoin into cash credit to spend on services like phone topups and bill payments through their app. The CEO of Celcoin, Marcelo França, explained that users can exchange bitcoins for cash credit on the Celcoin platform. They can then use this credit for various functions on the app such as phone topups, bill payments, cash withdrawals, and transfers to other users or bank accounts. Celcoin aims to make using bitcoin easier by allowing users to convert it into cash credit to use for everyday transactions on their platform.'}"
1,672199059798181601,0.055175,"{'training_text': 'Using Docker Hub build hooks to generate dynamic image labels Build hooks for Docker Hub automated builds allow running scripts to generate useful dynamic labels like build date and Git commit hash. This adds value missing from default automated builds. The article discusses using build hooks to generate dynamic labels for Docker images built with automated builds on Docker Hub. Automated builds are popular but don't allow running dynamic code during the build process. This makes it difficult to generate useful labels like org.labelschema.builddate and org.labelschema.vcsref. Build hooks allow running scripts that can build and tag the image. An example build hook is provided that generates these labels. The hook shows it's possible to have useful dynamic labels with automated builds. The article also provides statistics on usage of automated builds 26% of public Docker Hub images use them, accounting for 32% of pulls. Finally it demonstrates the labels on a sample image built this way on Docker Hub and MicroBadger.'}"
2,4607279316199873708,0.026817,"{'training_text': 'Google announces Cloud Search to provide unified search across G Suite Google Cloud Search uses machine learning to index G Suite data like emails and docs. It provides search, recommendations, and automation to help employees work more efficiently. Google is announcing a new product called Google Cloud Search, formerly known as Springboard. Cloud Search uses machine learning to provide a unified search experience across G Suite products. As companies move data to the cloud, searching across different formats like emails, documents, spreadsheets has become complex. Employees spend 20% of their time searching for information. Cloud Search brings Google Search capabilities to G Suite. It provides comprehensive search and proactive recommendations via \""assist cards\"" to help users throughout the day. Assist cards use machine learning to suggest relevant information like files needing attention or preparation for upcoming meetings. They aim to provide timely, relevant recommendations so users can navigate work more efficiently. More assist cards will be added over time as Google learns what information is most useful to users.'}"
3,335910242745901755,0.026018,"{'training_text': 'Google improves Gmail previews with closer Trello and GitHub integration Google announced Gmail will provide better summaries and previews for Trello project updates and GitHub code changes to help users quickly find key information. Google announced closer partnerships with Trello and GitHub to provide better previews in Gmail that help users find information more easily. Trello users will get summaries of project updates. GitHub users will get summaries of code changes and issues for repositories. Inbox by Gmail also improved integration with Google Alerts for filtering web information. The article provides background that the author has a business degree, founded a digital communications company, and has experience in technology columns and projects like Google Discovery, TechCult, AutoBlog, and UFO Archive.'}"
4,-1415040208471067980,0.020862,"{'training_text': 'Google Cloud Platform releases icon library for diagrams Google Cloud Platform published downloadable icons and sample diagrams to help visually represent GCP infrastructure and services. Google Cloud Platform has released downloadable icon sets and sample diagrams to help developers, architects, and partners visually represent complex cloud infrastructure in documents and presentations. The icons cover GCP products, services, and elements that can be combined with other providers' icons to show hybrid/multicloud setups. Over 50 sample diagrams are available in Slide and PowerPoint formats to help kickstart new diagrams. As more products launch, Google will add icons so users should revisit the icon library. This allows accurate, uptodate visual representations of Google Cloud Platform architectures and services.'}"


Much better: a variety of articles about Google technical products, similar to but different from the user's recent reading history, which is great. Now let's send in some interactions with a different sort of article.

Below are five sample items to choose from:

-4996336942690402156 -1878128207048892154 -3058031327323357308 -1633984990770981161 90383487344892230

In [47]:
articles_mlfeatures[articles_mlfeatures.item_id.isin([-4996336942690402156, -1878128207048892154, -3058031327323357308, -1633984990770981161, 90383487344892230])]

Unnamed: 0,creation_timestamp,item_id,lang,article_genre,training_text
732,1475614359,-4996336942690402156,en,design best practices,"Protecting periods of deep focus should be a priority for creative work The article discusses strategies to minimize distractions and interruptions in order to achieve a state of focused concentration or \""flow\"" for knowledge workers doing creative tasks. Interruptions and open office plans can severely hamper productivity for workers who need to enter a state of focused concentration or \""flow\"" to do creative work. Frequent task switching and distractions make it difficult to achieve this flow state. While employers claim open offices promote communication, studies show they actually reduce job satisfaction and productivity. To reclaim your focus, turn off as many notifications as possible on your devices. Only allow truly important things like phone calls to get through. Also, try to minimize meetings by asking people to explain their purpose over email first. If working remotely or getting a private office is an option, consider it to eliminate auditory distractions. Following practices like these can help knowledge workers spend more time in an uninterrupted flow state, leading to higher quality work and increased productivity. Protecting periods of deep focus should be a priority for any company that values creativity and output from its employees."
1044,1476787589,-3058031327323357308,en,design best practices,"Follow these key guidelines for better Java API design Good API design balances firm commitment with flexibility. Using a checklist can help avoid common mistakes like returning null or overusing arrays. Here is a more concise summary of the key points from the article: Good API design is crucial for all Java developers, as even code not shared with others is still an API used by the developer themselves. A well designed API balances a firm commitment with flexibility for the implementation. Using a checklist can help avoid common API design mistakes. Some key guidelines: Return Optional instead of null to indicate absence of a value Avoid using arrays to pass values, use Streams instead Provide static factory methods as single entry points for object creation Favor composition with functional interfaces over inheritance Annotate functional interfaces with FunctionalInterface Avoid overloading methods with functional interface parameters Minimize use of default methods in interfaces Validate method parameters before use Avoid directly calling Optional.get(), use other methods Split Stream pipelines across lines for readability Following guidelines like these can lead to more robust, usable and maintainable Java APIs."
1212,1481626644,-1878128207048892154,en,design best practices,"The article outlines timeless principles for effective software development collaboration and code review. Despite being written in 1971, Jerry Weinberg's \""Ten Commandments of Egoless Programming\"" provide enduring human principles for improving code quality through open mindedness and constructive criticism. The article discusses the \""Ten Commandments of Egoless Programming\"" from Jerry Weinberg's book The Psychology of Computer Programming. It outlines principles such as being open to learning from others, critiquing code rather than people, and making positive comments aimed at improving the code. The article notes that despite being written in 1971, the human principles behind software development remain timeless."
1256,1471520397,-1633984990770981161,en,design best practices,"UX and UI are complementary facets of holistic product design While UX focuses on the overall user experience through research and optimization, UI deals with the visual and interactive elements of the product interface. Successful products require a harmonious blend of great UX and UI. UX (User Experience) and UI (User Interface) are often confused, but they are distinct roles in product design. UX focuses on researching and optimizing the overall user experience, including usability testing, information architecture, interaction flows, and understanding user needs. UI, on the other hand, deals with the visual and interactive elements of the product interface like layouts, colors, typography, and animations. While UX and UI are separate disciplines, they are closely intertwined. Good UX drives the UI design choices, ensuring the interface meets user needs identified through research. Conversely, the UI execution impacts the overall user experience. Successful products require a harmonious blend of great UX and UI. The terms emerged in the 1990s as technology advanced rapidly. As digital products proliferated, design specializations arose to tackle the complexities of crafting optimal user experiences and interfaces. Today's designers must be multidisciplinary, drawing from fields like psychology, computer science, and graphic design. Ultimately, isolating UX from UI is misguided. They are complementary facets of holistic product design aimed at creating innovative, user centric experiences. Effective designers incorporate both UX and UI principles into their practice."
1567,1472750564,90383487344892230,en,design best practices,"Microservices should encapsulate both service and failure from day one A lesson learned from building microservices for a legacy banking platform: tightly coupling new microservices to the legacy database caused a 7 month delay due to failure propagation risks. The author explains a lesson learned from a previous project building microservices to extend a legacy retail banking platform. Despite delivering features quickly with the microservices approach, they tightly coupled the new microservices to the legacy system's large shared database due to infrastructure limitations. This tight coupling caused a 7 month delay in releasing the first microservice to production over concerns of impacting millions of users. The key lesson was that a new microservice should encapsulate both the unit of service and unit of failure from day one of development. This principle, based on Jim Gray's ideas on fault tolerance, avoids the risk of one microservice failure propagating beyond its module. The author realized the importance of this approach after their early dependency on the shared database posed too much risk to the production system."


To update a user's recommendations in real time as their behavior changes, we must first deploy an event tracker for the dataset group.

In [48]:
event_tracker_name = 'news-event-tracker'

try: 
    create_event_tracker_response = personalize.create_event_tracker(
        name = event_tracker_name,
        datasetGroupArn=workshop_dataset_group_arn
        )
    event_tracker_arn = create_event_tracker_response['eventTrackerArn']
    print(json.dumps(create_event_tracker_response, indent=2))
    print ('\nCreating the event_tracker with event_tracker_arn = {}'.format(event_tracker_arn))
    tracking_id = create_event_tracker_response['trackingId']
    print ('\nAnd trackingId = {}'.format(tracking_id))

except personalize.exceptions.ResourceAlreadyExistsException as e:
    event_tracker_list = personalize.list_event_trackers( 
        datasetGroupArn= workshop_dataset_group_arn
    )['eventTrackers']
    
    event_tracker_arn = event_tracker_list[0]['eventTrackerArn']
    
    describe_event_tracker_response = personalize.describe_event_tracker(
        eventTrackerArn=event_tracker_arn
    )
    tracking_id = describe_event_tracker_response['eventTracker']['trackingId']
    
    print ('\nThe Event Tracker with event_tracker_name {} already exists'.format(event_tracker_name))
    print ('\nWe will be using the existing Event Tracker with event_tracker_arn = {}'.format(event_tracker_arn))
    print ('\nAnd tracking_id = {}'.format(tracking_id))


The Event Tracker with event_tracker_name news-event-tracker already exists

We will be using the existing Event Tracker with event_tracker_arn = arn:aws:personalize:us-east-1:908388459961:event-tracker/e0564701

And tracking_id = 1f9ded53-2c43-4404-a7d1-b79b8e84b078
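
A newly created event tracker may take a short while to become usable. The following is a minimal sketch of a polling helper, assuming the `describe_event_tracker` response reports a `status` field that moves through `CREATE PENDING` and `CREATE IN_PROGRESS` to `ACTIVE` (or `CREATE FAILED`). The function name `wait_for_event_tracker` and its `poll_seconds` and `max_polls` parameters are hypothetical and not part of the original notebook.

```python
import time


def wait_for_event_tracker(personalize_client, event_tracker_arn,
                           poll_seconds=10, max_polls=30):
    """Poll describe_event_tracker until the tracker reports ACTIVE.

    Hypothetical helper: the client is passed in so it can be the
    notebook's `personalize` client or a stand-in for testing.
    """
    for _ in range(max_polls):
        status = personalize_client.describe_event_tracker(
            eventTrackerArn=event_tracker_arn
        )['eventTracker']['status']
        if status == 'ACTIVE':
            return status
        if status == 'CREATE FAILED':
            raise RuntimeError('Event tracker creation failed: ' + event_tracker_arn)
        # Any other status is treated as "still being created"; wait and retry.
        time.sleep(poll_seconds)
    raise TimeoutError('Event tracker not ACTIVE after {} polls'.format(max_polls))
```

In this notebook it could be called as `wait_for_event_tracker(personalize, event_tracker_arn)` before sending the first event.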


Now let's submit an interaction for our earlier user with one of the items above.

In [49]:
response = personalize_events.put_events(
    trackingId=tracking_id,
    userId=sample_user,
    sessionId='session1',
    eventList=[
        {
            'eventId': 'madeupevent1',
            'eventType': 'VIEW',
            'itemId': '-1633984990770981161',
            'sentAt': 1714006143,
        },
    ]
)

In [50]:
response

{'ResponseMetadata': {'RequestId': '3bca8bd4-c00f-440c-92ee-005f6155e362',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Tue, 23 Jul 2024 21:50:25 GMT',
   'content-type': 'application/json',
   'content-length': '0',
   'connection': 'keep-alive',
   'x-amzn-requestid': '3bca8bd4-c00f-440c-92ee-005f6155e362'},
  'RetryAttempts': 0}}
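
The event above uses a made-up `eventId` and a fixed `sentAt` timestamp, which is fine for a demonstration. In an application you would more likely generate both at the moment the interaction happens. The sketch below shows one way to do that; `build_event` is a hypothetical helper name, and the use of a random UUID as the event ID is an assumption rather than something the notebook prescribes.

```python
import time
import uuid


def build_event(item_id, event_type='VIEW', sent_at=None):
    """Build one entry for the eventList argument of put_events.

    Hypothetical helper: generates a unique eventId and defaults
    sentAt to the current Unix time in seconds.
    """
    return {
        'eventId': str(uuid.uuid4()),
        'eventType': event_type,
        'itemId': str(item_id),  # item IDs are sent as strings
        'sentAt': int(time.time()) if sent_at is None else int(sent_at),
    }
```

The result can be passed directly as an element of `eventList`, for example `eventList=[build_event('-1633984990770981161')]`.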

Now let's see how our recommendations change:

In [54]:
get_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = workshop_userpersonalization_campaign_arn,
    userId = sample_user,
    numResults = 5,
    filterArn = genre_filter_arn,
    filterValues = {"CUTOFF": "\"0\"", "GENRELIST": "\"tech\",\"non tech\",\"cloud provider news\",\"crypto currency\""},
    metadataColumns = {"ITEMS": ["training_text"]}
)
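
The `filterValues` strings in the call above are easy to get wrong by hand, because each value must be wrapped in escaped double quotes and multiple values are joined with commas. The following sketch builds the same strings from ordinary Python values. `format_filter_value` is a hypothetical helper, and it assumes the values themselves contain no double quotes.

```python
def format_filter_value(values):
    """Quote each value and join with commas, matching the filterValues format used above.

    Hypothetical helper; does not escape double quotes inside values.
    """
    if isinstance(values, (str, int, float)):
        values = [values]  # treat a single value as a one-element list
    return ','.join('"{}"'.format(v) for v in values)


genres = ['tech', 'non tech', 'cloud provider news', 'crypto currency']
filter_values = {
    'CUTOFF': format_filter_value(0),
    'GENRELIST': format_filter_value(genres),
}
```

The resulting `filter_values` dictionary is equal to the literal passed as `filterValues` in the cell above.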

In [55]:
recommendations_df = pd.DataFrame.from_records(get_recommendations_response['itemList'])

In [56]:
recommendations_df

Unnamed: 0,itemId,score,metadata
0,3618205920179577598,0.068442,"{'training_text': 'Celcoin allows users to exchange bitcoin for cash credit to use for everyday transactions The crypto platform Celcoin lets users convert bitcoin into cash credit to spend on services like phone topups and bill payments through their app. The CEO of Celcoin, Marcelo França, explained that users can exchange bitcoins for cash credit on the Celcoin platform. They can then use this credit for various functions on the app such as phone topups, bill payments, cash withdrawals, and transfers to other users or bank accounts. Celcoin aims to make using bitcoin easier by allowing users to convert it into cash credit to use for everyday transactions on their platform.'}"
1,-6289909056857931861,0.03908,"{'training_text': 'Implementing an effective designerdeveloper workflow for quality UX in banking A UX engineering methodology for banking apps focuses on understanding user needs, analyzing competitors, and implementing an intuitive user interface. To implement strong user design, you need a disciplined designerdeveloper workflow leading to quality implementation on deadline. Today, many discuss UX design in banking but few practice it. A popular question is about UX engineering methodology. While workflows vary, here is one used by UXDA: 1. Agree on the goal and measure success. Document and refer to it daily. 2. Find pain points through user research like surveys and social media monitoring. 3. Check industry publications and apps for ideal banking solutions. Extract principles and use for strategy. 4. Analyze competitors for features and patterns. Consider user perception. 5. Bring passion into UX to inspire and delight users based on their needs. 6. Identify typical personas through interviews. Evaluate solutions from their perspectives. 7. Prioritize tasks for the 20% of features used by 80% of customers. Make them intuitive. 8. Map the ideal user journey combining research to identify bottlenecks and insights. 9. Wireframe interactions for key scenarios to test the solution and get early feedback. 10. Implement the vision into the user interface design after refinements.'}"
2,1469580151036142903,0.032035,"{'training_text': 'Should code be documented or made selfdocumenting? An article discusses the debate around documenting code versus writing clean, selfdocumenting code without separate documentation. The article discusses whether developers should document their code or if writing clean, selfdocumenting code is sufficient. The author used to believe code should be documented, but often found the documentation became outdated. After reading Clean Code, the author was convinced documentation is unnecessary if code uses meaningful names for variables, methods, classes, etc. This allows the code to be read and understood without separate documentation. The author acknowledges complex algorithms may still need documentation. The article concludes by asking readers whether they document code or try to make it selfdocumenting.'}"
3,-624901815223005993,0.028671,"{'training_text': 'Google launches AI Duet to generate musical accompaniments Google's new AI Duet experiment allows anyone to play simple melodies and have AI generate accompanying music in realtime. Users can adjust the AI's musicality and improvisation. Google has launched AI Duet, a new artificial intelligence experiment that lets users play melodies on their computer keyboard or MIDI keyboard, and the AI will automatically generate an accompanying melody. Users don't need any musical ability they can just press a few keys and the AI will respond with complementary notes. Google says this allows anyone to experience how AI can help bring creative ideas to life, whether they are a developer, musician, or just curious. The experiment uses machine learning to generate the duet in real time. Users can adjust the AI's musicality and improvisation to create different effects. AI Duet provides an accessible way for people to interact with AI and make music together.'}"
4,4876769046116846438,0.026817,"{'training_text': 'Brazilian mall partners with startup on sustainable urban farm A mall in Brazil will open an onsite urban farm using greenhouses and recycled shipping containers to produce local organic food and promote sustainability. The Boulevard Shopping mall in Belo Horizonte, Brazil is partnering with startup BeGreen to open the city's first Urban Farm in March 2017. The 2,700 sq m farm will include a 1,500 sq m greenhouse capable of producing 50,000 heads of lettuce and herbs per month. The space will also have a farmtotable restaurant using the produce, a store selling the cultivated products, and an event space promoting sustainable living. The farm will use organic waste from the mall's food court as compost and will be selfsufficient in electricity. The goal is to demonstrate the viability of sustainable local food production that reduces waste and environmental impact. The farm addresses the issue that 80% of Brazil's produce is wasted due to long supply chains from farm to consumer. The onsite farm eliminates transportation and intermediaries so food reaches consumers faster. The construction utilizes reused shipping containers and other recycled materials. The project aims to provide fair trade for local producers who can also sell at the farm's store. It will offer public tours and training to promote the concept of urban agriculture.'}"


The top recommendations have changed a bit, though within the top five we still see a couple related to Google technical products. Let's send in some more interactions. It is usually best practice to send interactions as they occur so that the user's recommendations can adjust in real time, but you can also submit interactions in batches of up to 10, and we will take advantage of this batching below. We will also record these interactions as part of our most recent session.

In [57]:
response = personalize_events.put_events(
    trackingId=tracking_id,
    userId=sample_user,
    sessionId='session1',
    eventList=[
        {
            'eventId': 'madeupevent2',
            'eventType': 'VIEW',
            'itemId': '-1878128207048892154',
            'sentAt': 1714011143,
        },
        {
            'eventId': 'madeupevent3',
            'eventType': 'VIEW',
            'itemId': '-3058031327323357308',
            'sentAt': 1714013143,
        },
        {
            'eventId': 'madeupevent4',
            'eventType': 'VIEW',
            'itemId': '-4996336942690402156',
            'sentAt': 1714015143,
        },
        {
            'eventId': 'madeupevent5',
            'eventType': 'VIEW',
            'itemId': '90383487344892230',
            'sentAt': 1714018143,
        },
    ]
)
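
If an application has accumulated more than 10 events for a session, they need to be split across several `put_events` calls. The sketch below shows one way to do that. The helper names `chunk_events` and `put_events_in_batches` are hypothetical, and the events client is passed in as a parameter (in this notebook it would be `personalize_events`).

```python
def chunk_events(events, batch_size=10):
    """Split a list of events into consecutive batches of at most batch_size."""
    if not 1 <= batch_size <= 10:
        raise ValueError('batch_size must be between 1 and 10')
    return [events[i:i + batch_size] for i in range(0, len(events), batch_size)]


def put_events_in_batches(events_client, tracking_id, user_id, session_id, events):
    """Send events in order, one put_events call per batch; returns the number of calls."""
    batches = chunk_events(events)
    for batch in batches:
        events_client.put_events(
            trackingId=tracking_id,
            userId=user_id,
            sessionId=session_id,
            eventList=batch,
        )
    return len(batches)
```

Batching trades some freshness for fewer API calls, so it is best suited to cases where the events cannot be sent as they occur.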

Now let's see how this changes our recommendations:

In [59]:
get_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = workshop_userpersonalization_campaign_arn,
    userId = sample_user,
    numResults = 5,
    filterArn = genre_filter_arn,
    filterValues = {"CUTOFF": "\"0\"", "GENRELIST": "\"tech\",\"non tech\",\"cloud provider news\",\"crypto currency\""},
    metadataColumns = {"ITEMS": ["training_text"]}
)

In [60]:
recommendations_df = pd.DataFrame.from_records(get_recommendations_response['itemList'])
recommendations_df

Unnamed: 0,itemId,score,metadata
0,1649752043999819668,0.039094,"{'training_text': 'Node.js benefits and challenges for operations teams Node.js enables faster development but its singlethreaded, asynchronous nature causes issues like memory leaks and CPU blocking that require monitoring and diagnostics. Node.js is a fast growing platform used by many startups and enterprises. It acts as a new tier that connects legacy systems with new technologies. The benefits of Node.js include enabling faster development and deployment. However, its speed introduces challenges for operations and performance teams who focus on availability and performance. Some common Node.js problems these teams face include: 1. Memory leaks Node.js is prone to memory leaks which cause crashes. Heap dumps can help track down the cause. 2. CPU problems Node.js runs singlethreaded so CPUheavy operations can block requests. CPU sampling can identify where time is being spent. 3. Back pressure Slow backends can cause congestion in Node.js. Monitoring intertier communication can identify the root cause. 4. Security Node.js relies heavily on third party modules which can contain vulnerabilities. Scanning for issues and using a private module repository can help. Node.js provides hooks for monitoring and debugging to help diagnose issues. The community is also actively improving tracing and debugging capabilities. With proper monitoring and diagnostics, operations teams can smoothly transition to using Node.js.'}"
1,-3351652027149912881,0.019325,"{'training_text': 'Feather file format improves Python and R dataframe interoperability The new Feather file format uses Apache Arrow columnar data representation to enable fast reading and writing of dataframes between Python and R. Wes McKinney and Hadley Wickham met in January to discuss improving interoperability between Python, R, and external systems. They noticed Python pandas dataframes and R dataframes have similar semantic models but different internal memory representations. Around this time Apache Arrow was started to improve data interoperability for columnar tabular data. They decided to use insights from feather to make a fast file format called Feather for storing dataframes usable by both Python and R. Feather is a fast, lightweight binary file format for storing dataframes with goals of simplicity, language agnosticism, and high read/write performance. The Feather API makes reading/writing dataframes easy. Feather is extremely fast, achieving over 600 MB/s write performance on SSDs. Feather can be installed from GitHub for R and PyPI for Python. It is not designed for longterm storage but for exchanging data between Python and R. Feather brings benefits of the Arrow spec to users through efficient languageagnostic tabular data representation. It uses Flatbuffers to serialize column metadata in a languageindependent way. The Python interface uses Cython and the R interface uses Rcpp to expose the C++11 core.'}"
2,7814856426770804213,0.016591,"{'training_text': 'New MacBook Pro's controversial changes generate mixed reviews The 2016 MacBook Pro replaces function keys with a touch bar and offers only minor RAM and processor improvements, upsetting developers and leading to mixed reviews. The new MacBook Pro announced at Apple's event today has some controversial changes, including replacing the function keys with a touch bar and offering only minor improvements in RAM and processors over previous models. Many developers are upset about the loss of the Esc key in particular. The specs seem outdated compared to cheaper Windows and Linux alternatives. Overall, reviews of the new MacBook Pro are mixed, with many lamenting the high price tag for relatively modest upgrades over earlier models. Some suggest the touch bar could have been added above the function keys instead of replacing them. The 2016 MacBook Pro seems overpriced for the specs offered.'}"
3,-7646922141533719881,0.013971,"{'training_text': 'Integrating microservices with legacy systems using REST APIs Microservices enable legacy systems to have REST APIs by building adapter layers with frameworks like Spring Boot. Allinone solutions like Sensedia provide deployment and management. Microservices architecture has become popular for building scalable and modular applications by decomposing them into independent services. However, integrating microservices with legacy systems that don't have standard interfaces like REST APIs can be challenging. The common solution is to build an adapter or integration layer to expose the legacy system's functionality as a RESTful microservice. These adapters should follow best practices like being lightweight, scalable, and using common standards like Swagger. Popular Java frameworks like Spring Boot and Apache Camel are good choices. Once built, microservices should be deployed to a PaaS like Cloud Foundry, OpenShift or Heroku for containerization, orchestration, monitoring, etc. This provides scalability and automation. Exposed microservices should then be managed by an API Management platform. This provides security, access control, caching, analytics and more. Sensedia API Management Suite offers integrated microservice deployment through its BaaS feature as well as full API management capabilities. This endtoend solution avoids needing separate infrastructure for running microservices. In summary, microservices enable REST APIs for legacy systems. Choosing the right frameworks and platforms is key. Allinone solutions like Sensedia provide microservice deployment and management alongside API management for an optimal approach.'}"
4,-4831310174172854034,0.012447,"{'training_text': 'Netflix's journey to microservices and the cloud enabled continuous delivery and independent teams Netflix adopted microservices and cloud infrastructure for availability, scale, and speed. This let them continuously deliver features through independent teams. Back in 2008, Netflix started their journey to microservices and the cloud for three reasons: availability, scale, and speed. As a 24x7 service, Netflix needs to always be online, but their monolithic codebase made troubleshooting difficult. The company also faced scaling challenges that were hard to address within a single application. Finally, Netflix needed to optimize for speed and agility in a competitive market. Microservices have many individual components, each with a specific responsibility, loosely coupled to each other. Services interact through contracts or messaging, with zero knowledge about implementation details. Microservices also encourage small teams to build and deploy focused services independently and continuously through automation. This is a major shift from monolithic systems built by large teams. Monoliths have interconnected components, making updates risky. Microservices allow fast, incremental changes without fragile dependencies. As an early adopter, Netflix solved complex problems with microservices and shared lessons through open source software. Anyone can now use tools like Spring Cloud to build cloudscale systems with Netflix components. Microservices enable continuous delivery of business value through independent teams and automation.'}"
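
One way to quantify how much the list moved is to compare the item IDs returned before and after the new events. The sketch below does this with the two sets of IDs shown in the outputs in this section; `compare_recommendations` is a hypothetical helper name.

```python
def compare_recommendations(before_ids, after_ids):
    """Report which item IDs were kept, newly added, or dropped between two lists."""
    before, after = set(before_ids), set(after_ids)
    return {
        'kept': [i for i in after_ids if i in before],
        'new': [i for i in after_ids if i not in before],
        'dropped': [i for i in before_ids if i not in after],
    }


# Item IDs copied from the two recommendation outputs shown in this section.
after_first_event = [
    '3618205920179577598', '-6289909056857931861', '1469580151036142903',
    '-624901815223005993', '4876769046116846438',
]
after_batch = [
    '1649752043999819668', '-3351652027149912881', '7814856426770804213',
    '-7646922141533719881', '-4831310174172854034',
]
diff = compare_recommendations(after_first_event, after_batch)
print(len(diff['kept']), 'kept,', len(diff['new']), 'new')
```

With live responses, the ID lists could be taken from `[item['itemId'] for item in get_recommendations_response['itemList']]`.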


There we go: now all of the recommendations are different. The last thing we will go over is how to use a promotional filter. It is highly recommended to apply a promotional filter on an article's publication date so that a certain percentage (perhaps even 100) of the recommended articles are recent ones. Common filter periods include 2 weeks, 3 days, 1 day, or even 12 hours, depending on the particular asset we are looking to populate with articles.

In [61]:
get_recommendations_response = personalize_runtime.get_recommendations(
    promotions=[
        {
            'name': 'timefilter',
            'percentPromotedItems': 60,
            'filterArn': genre_filter_arn,
            'filterValues': {"CUTOFF": "\"1480000000\"", "GENRELIST": "\"tech\",\"non tech\",\"cloud provider news\",\"crypto currency\""}
        },
    ],
    campaignArn = workshop_userpersonalization_campaign_arn,
    userId = sample_user,
    numResults = 5,
    filterArn = genre_filter_arn,
    filterValues = {"CUTOFF": "\"0\"", "GENRELIST": "\"tech\",\"non tech\",\"cloud provider news\",\"crypto currency\""},
    metadataColumns = {"ITEMS": ["training_text"]}
)
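
The promotion above uses a fixed `CUTOFF` of 1480000000, which suits the sample data because its timestamps appear to date from around 2016. For live content, the cutoff would more likely be a rolling window such as the last 3 days. The sketch below computes such a cutoff, assuming the filter compares `CUTOFF` against a publication timestamp in Unix seconds; `recency_cutoff` and `promotion_filter_values` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone


def recency_cutoff(window, now=None):
    """Return the Unix timestamp (in seconds) that lies `window` before `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    return int((now - window).timestamp())


def promotion_filter_values(window, genres, now=None):
    """Build a filterValues dict in the quoted-string format used in this notebook."""
    return {
        'CUTOFF': '"{}"'.format(recency_cutoff(window, now)),
        'GENRELIST': ','.join('"{}"'.format(g) for g in genres),
    }


# The periods mentioned above, expressed as timedeltas.
WINDOWS = {
    '2 weeks': timedelta(weeks=2),
    '3 days': timedelta(days=3),
    '1 day': timedelta(days=1),
    '12 hours': timedelta(hours=12),
}
```

For example, `promotion_filter_values(WINDOWS['3 days'], ['tech', 'non tech'])` could be passed as the `filterValues` of a promotion. Against the workshop dataset a rolling window relative to today would likely match no articles, so the fixed cutoff remains the practical choice here.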

In [62]:
recommendations_df = pd.DataFrame.from_records(get_recommendations_response['itemList'])
recommendations_df

Unnamed: 0,itemId,score,promotionName,metadata
0,-7646922141533719881,0.074653,timefilter,"{'training_text': 'Integrating microservices with legacy systems using REST APIs Microservices enable legacy systems to have REST APIs by building adapter layers with frameworks like Spring Boot. Allinone solutions like Sensedia provide deployment and management. Microservices architecture has become popular for building scalable and modular applications by decomposing them into independent services. However, integrating microservices with legacy systems that don't have standard interfaces like REST APIs can be challenging. The common solution is to build an adapter or integration layer to expose the legacy system's functionality as a RESTful microservice. These adapters should follow best practices like being lightweight, scalable, and using common standards like Swagger. Popular Java frameworks like Spring Boot and Apache Camel are good choices. Once built, microservices should be deployed to a PaaS like Cloud Foundry, OpenShift or Heroku for containerization, orchestration, monitoring, etc. This provides scalability and automation. Exposed microservices should then be managed by an API Management platform. This provides security, access control, caching, analytics and more. Sensedia API Management Suite offers integrated microservice deployment through its BaaS feature as well as full API management capabilities. This endtoend solution avoids needing separate infrastructure for running microservices. In summary, microservices enable REST APIs for legacy systems. Choosing the right frameworks and platforms is key. Allinone solutions like Sensedia provide microservice deployment and management alongside API management for an optimal approach.'}"
1,-2097075598039554565,0.040539,timefilter,"{'training_text': 'Key software development trends in 2017 The article summarizes notable programming languages, frameworks, databases, tools, and technologies for software development in 2017. The software industry continues to evolve rapidly. In 2017, some key trends are progressive web apps, bots, consolidation of frontend frameworks, cloud computing, and machine learning. For languages, JavaScript remains important with new ES2017 features like async/await. TypeScript 2.1 brings async/await to older browsers. C# 7.0 enhances the language and .NET Core runs crossplatform. Ruby and Python continue to be popular. For frontend, Angular 2 is popular for enterprise apps. Ember provides stability. React has a complex ecosystem. Bootstrap 4 modernizes styling. SASS and LESS remain top CSS preprocessors. For backend, Node.js is popular for JavaScript. Laravel and Symfony are top PHP frameworks. Django and Flask are leading Python frameworks. For databases, MySQL 8, PostgreSQL 9.6, CouchDB, and Redis all have notable improvements. For tools, Yarn is a faster npm alternative. Visual Studio Code and Atom are popular editors. Git enables version control. Electron and NW.js allow desktop apps with web tech. Ansible and Docker are key for DevOps. Other technologies to learn include cloud platforms like AWS, machine learning libraries like TensorFlow, and VR development.'}"
2,1649752043999819668,0.039112,,"{'training_text': 'Node.js benefits and challenges for operations teams Node.js enables faster development but its singlethreaded, asynchronous nature causes issues like memory leaks and CPU blocking that require monitoring and diagnostics. Node.js is a fast growing platform used by many startups and enterprises. It acts as a new tier that connects legacy systems with new technologies. The benefits of Node.js include enabling faster development and deployment. However, its speed introduces challenges for operations and performance teams who focus on availability and performance. Some common Node.js problems these teams face include: 1. Memory leaks Node.js is prone to memory leaks which cause crashes. Heap dumps can help track down the cause. 2. CPU problems Node.js runs singlethreaded so CPUheavy operations can block requests. CPU sampling can identify where time is being spent. 3. Back pressure Slow backends can cause congestion in Node.js. Monitoring intertier communication can identify the root cause. 4. Security Node.js relies heavily on third party modules which can contain vulnerabilities. Scanning for issues and using a private module repository can help. Node.js provides hooks for monitoring and debugging to help diagnose issues. The community is also actively improving tracing and debugging capabilities. With proper monitoring and diagnostics, operations teams can smoothly transition to using Node.js.'}"
3,7065704533945771463,0.032774,timefilter,"{'training_text': 'Brazil's Central Bank may force Nubank to close down A new Central Bank rule in Brazil reducing credit card payment timelines could bankrupt fintech company Nubank, which lacks the capital reserves of large banks. Nubank, a major fintech company in Brazil, may have to close down if the Central Bank confirms a new rule on Tuesday, December 20 that would reduce the payment deadline from credit card companies to merchants from 30 days to just 2 days. This change would negatively impact all credit card companies, but especially smaller ones like Nubank that don't have the capital reserves of large banks. Nubank cofounder Cristina Junqueira criticized the idea, saying the current 30 day period allows time for the customer to pay their credit card bill, the issuer to receive the funds, and the purchaser to pay the retailer. She says reducing to 2 days would force Nubank to pay purchasers before even receiving customer payments. Simulations show that even 15 days would require Nubank to raise almost $1 billion in capital immediately to survive. Junqueira describes the potential change as \""apocalyptic\"" and says with just 2 days, Nubank would have to \""turn off the light and close the door.\"".'}"
4,-3351652027149912881,0.019335,,"{'training_text': 'Feather file format improves Python and R dataframe interoperability The new Feather file format uses Apache Arrow columnar data representation to enable fast reading and writing of dataframes between Python and R. Wes McKinney and Hadley Wickham met in January to discuss improving interoperability between Python, R, and external systems. They noticed Python pandas dataframes and R dataframes have similar semantic models but different internal memory representations. Around this time Apache Arrow was started to improve data interoperability for columnar tabular data. They decided to use insights from feather to make a fast file format called Feather for storing dataframes usable by both Python and R. Feather is a fast, lightweight binary file format for storing dataframes with goals of simplicity, language agnosticism, and high read/write performance. The Feather API makes reading/writing dataframes easy. Feather is extremely fast, achieving over 600 MB/s write performance on SSDs. Feather can be installed from GitHub for R and PyPI for Python. It is not designed for longterm storage but for exchanging data between Python and R. Feather brings benefits of the Arrow spec to users through efficient languageagnostic tabular data representation. It uses Flatbuffers to serialize column metadata in a languageindependent way. The Python interface uses Cython and the R interface uses Rcpp to expose the C++11 core.'}"
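
To check that the promotion behaved as configured, you can measure what share of the returned items carry the promotion's name. The sketch below does this on a list shaped like `get_recommendations_response['itemList']`, using only the IDs and `promotionName` values from the output above. `promoted_share` is a hypothetical helper name.

```python
def promoted_share(item_list, promotion_name):
    """Return the percentage of items whose promotionName matches promotion_name."""
    if not item_list:
        return 0.0
    promoted = sum(1 for item in item_list
                   if item.get('promotionName') == promotion_name)
    return 100.0 * promoted / len(item_list)


# IDs and promotion names copied from the output above; other fields omitted.
item_list = [
    {'itemId': '-7646922141533719881', 'promotionName': 'timefilter'},
    {'itemId': '-2097075598039554565', 'promotionName': 'timefilter'},
    {'itemId': '1649752043999819668'},
    {'itemId': '7065704533945771463', 'promotionName': 'timefilter'},
    {'itemId': '-3351652027149912881'},
]
share = promoted_share(item_list, 'timefilter')
```

Here three of the five items are promoted, which matches the `percentPromotedItems` value of 60 in the request.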


## Wrap up <a class="anchor" id="wrapup"></a>
[Back to top](#top)

With that, you now have a fully working collection of models to tackle various recommendation and personalization scenarios, the skills to manipulate customer data to better integrate with the service, and the knowledge of how to do all this through APIs and open source data science tools.

You'll want to make sure that you clean up all of the resources deployed during this POC. We have provided a separate notebook which shows you how to identify and delete the resources in [`Retail_04_Clean_Up.ipynb`](Retail_04_Clean_Up.ipynb).