In [16]:
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings

In [17]:
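# Load the PDF; PyMuPDFLoader yields one Document per page by default
pdf_loader = PyMuPDFLoader(file_path="ADD-953.pdf")
pdf = pdf_loader.load()
pages = [page.page_content for page in pdf]
texts = "\n\n".join(pages)
# Token-based chunking: chunk_size and chunk_overlap are counted in tiktoken tokens
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(chunk_size=1000, chunk_overlap=50)
chunks = text_splitter.split_text(texts)
# Note: the cells below embed and index `pages`; `chunks` is not used further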
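
To make the `chunk_size` / `chunk_overlap` behaviour concrete, here is a minimal sketch of greedy chunk packing with overlap. It is a simplification and not LangChain's implementation: the helper `split_with_overlap` is hypothetical, and it counts whitespace-separated words where the cell above counts tiktoken tokens.

```python
def split_with_overlap(text, chunk_size, chunk_overlap, separator="\n\n"):
    """Greedily pack separator-delimited pieces into chunks of at most
    chunk_size words, carrying trailing pieces forward as overlap."""
    def length(piece):
        # Assumption: word count stands in for a real token count
        return len(piece.split())

    pieces = [p for p in text.split(separator) if p.strip()]
    chunks, current = [], []
    for piece in pieces:
        total = sum(length(p) for p in current)
        if current and total + length(piece) > chunk_size:
            # The current chunk is full: emit it
            chunks.append(separator.join(current))
            # Keep only a tail that fits the overlap budget and leaves room for `piece`
            while current and (total > chunk_overlap
                               or total + length(piece) > chunk_size):
                total -= length(current.pop(0))
        current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks


sample = "a b c\n\nd e\n\nf g h\n\ni"
for chunk in split_with_overlap(sample, chunk_size=5, chunk_overlap=2):
    print(repr(chunk))
```

In the toy run, the piece `d e` appears at the end of one chunk and the start of the next, which is the role `chunk_overlap` plays.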

In [None]:
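import numpy as np
embedding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
# embed_documents returns a list of vectors; wrap it in an array to show (n_pages, dim)
np.array(embedding.embed_documents(pages)).shape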

(41, 384)

In [19]:
%pip install langchain_google_genai

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Note: you may need to restart the kernel to use updated packages.


In [20]:
from langchain_community.vectorstores import FAISS
vector_store = FAISS.from_texts(texts=pages, embedding=embedding)
retriever = vector_store.as_retriever()
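
Conceptually, the retriever embeds the query and returns the stored texts whose vectors lie closest to it. The sketch below shows that nearest-neighbour step in plain Python on toy 3-dimensional vectors. It assumes Euclidean distance and `k=4`, which I understand to be the defaults of the LangChain FAISS wrapper; `top_k` is a hypothetical helper, not a library call.

```python
import math


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors closest to query_vec (Euclidean distance)."""
    distances = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    return [idx for _, idx in sorted(distances)[:k]]


# Toy stand-ins for the 384-dimensional page embeddings
doc_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
query_vec = [1.0, 0.0, 0.0]

print(top_k(query_vec, doc_vecs, k=2))
```

A real index avoids this linear scan for large collections, but for 41 pages an exhaustive comparison like this is cheap.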

In [21]:
%pip install python-dotenv



In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
import os
from dotenv import load_dotenv
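load_dotenv()  # loads GOOGLE_API_KEY from a local .env file into os.environ
if not os.getenv("GOOGLE_API_KEY"):
    raise RuntimeError("GOOGLE_API_KEY is not set; add it to .env or the environment")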
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.0)

In [32]:
from langchain_core.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate
from langchain_core.runnables import RunnablePassthrough



chat_template = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(
        "Search in the provided context {context} and relate it to OTT platforms like HBO Max, Netflix, Hulu, Peacock, etc."
        ), 
    HumanMessagePromptTemplate.from_template("{question}")]
    )
chain = ({"context": retriever, "question": RunnablePassthrough()} | chat_template | llm)

AIMessage(content='Based on the provided documents, **Commerce SKU Setup** refers to the process of defining and organizing the various products, price plans, and their unique identifiers (SKUs) within a commerce system, specifically for managing subscriptions and entitlements, particularly in a TV Everywhere (TVE) context.\n\nHere\'s a breakdown of the concept as described:\n\n1.  **Individual B2B Products and Price Plans per Channel:**\n    *   Each content channel (e.g., TNT, TBS, TruTV) is treated as its own "b2b product."\n    *   Each of these products will have "associated price plans." This implies different ways the content can be offered or bundled.\n\n2.  **Mapping to SKUs and Partner IDs:**\n    *   These "PricePlans" are then mapped to specific **SKUs (Stock Keeping Units)**. An SKU is a unique identifier for a particular product/price plan combination.\n    *   They are also mapped to **partnerIds** in a "PartnerSKU mapping." This is crucial for TVE, as it links a specifi
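
In the chain above, the retriever's output (a list of `Document` objects) is passed straight into `{context}`, so the prompt receives the list's string representation, metadata included. A common variant joins only the page text first. The sketch below assumes a hypothetical `Doc` dataclass as a stand-in for LangChain's `Document`, and `format_docs` is an illustrative helper name.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved Document (only page_content is used)."""
    page_content: str


def format_docs(docs):
    """Join the text of the retrieved documents into a single context string."""
    return "\n\n".join(doc.page_content for doc in docs)


retrieved = [
    Doc("Each channel is its own b2b product."),
    Doc("PricePlans are mapped to SKUs."),
]
print(format_docs(retrieved))
```

If adopted, the chain's context entry would become `retriever | format_docs`, since LCEL coerces a plain function into a runnable when it is piped.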

In [42]:
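def self_rag(question):
    # Ask the model directly first, without any retrieved context
    first_answer = llm.invoke(question)
    # llm.invoke returns an AIMessage, so the substring check must run on .content
    if 'TNT' not in first_answer.content:
        print('llm could not predict on its own. Hence generating from context provided')
        return chain.invoke(question)
    else:
        return first_answer
result = self_rag(question="Explain commerce SKU setup")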

llm could not predict on its own. Hence generating from context provided
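
The gating logic in `self_rag` can be exercised without any API calls by injecting the two answer sources as plain callables. This is a sketch under assumptions: `gated_answer`, `stub_llm`, and `stub_rag` are hypothetical names, and the stubs return plain strings in place of the `.content` of real model responses.

```python
def gated_answer(question, ask_llm, ask_rag, required_terms=("TNT",)):
    """Try the bare model first; fall back to retrieval when the reply
    mentions none of the required terms. Returns (answer, route)."""
    first = ask_llm(question)
    if any(term in first for term in required_terms):
        return first, "direct"
    return ask_rag(question), "retrieval"


# Stub callables standing in for the direct model call and the retrieval chain
def stub_llm(question):
    return "A SKU is a stock keeping unit."


def stub_rag(question):
    return "Each channel (TNT, TBS, TruTV) is its own b2b product."


answer, route = gated_answer("Explain commerce SKU setup", stub_llm, stub_rag)
print(route)
print(answer)
```

Checking for a hard-coded term such as `TNT` is a coarse proxy for whether the model knew the document's content; it works here only because the term is specific to this PDF.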


In [45]:
from IPython.display import Markdown, display
display(Markdown(result.content))
# result

Based on the provided context, the **Commerce SKU Setup** describes how different content channels are configured as products with associated pricing for business-to-business (B2B) transactions and subscription management.

Here's a breakdown:

1.  **Individual B2B Products and Price Plans:** Each channel (e.g., TNT, TBS, TruTV) is treated as its own distinct B2B product. Each product has one or more associated price plans, defining what the product costs and what it includes.

2.  **SKU and Partner ID Mapping:** These price plans are then mapped to specific **SKUs (Stock Keeping Units)**. An SKU is a unique identifier for a product or service, used for inventory and sales tracking. Alongside the SKUs, `partnerIds` are also mapped in a `PartnerSKU` mapping system. This likely links the specific product/price plan (SKU) to a particular partner or distributor.

3.  **GAuth for Subscription Management:** `GAuth` (presumably an authentication and authorization system) is responsible for managing `PartnerSubscriptions` for each channel individually. This means `GAuth` tracks which partners have subscriptions to which channels/products.

4.  **Multi-SKU API:** To streamline the process, `GAuth` can use a multi-SKU API. This allows for the creation of multiple subscriptions (potentially for different channels or different tiers) in a single transaction or operation.

**Relating this to OTT platforms like HBO Max, Netflix, Hulu, Peacock:**

Imagine these OTT platforms as the "channels" in the context:

*   **HBO Max:**
    *   **B2B Product:** "HBO Max Ad-Free Subscription," "HBO Max With Ads Subscription."
    *   **Price Plans:** Different monthly/annual rates for each tier.
    *   **SKUs:** Each specific subscription tier (e.g., "HBOMAX_ADFREE_MONTHLY," "HBOMAX_ADS_ANNUAL") would have a unique SKU.
    *   **PartnerSKU Mapping:** If HBO Max is offered as an add-on through a cable provider (e.g., Comcast) or a mobile carrier (e.g., AT&T), that provider would be a "partner." The `PartnerSKU` mapping would link Comcast's `partnerId` to the "HBOMAX_ADFREE_MONTHLY" SKU, indicating that Comcast can sell or bundle that specific HBO Max product.
    *   **GAuth/Subscription Management:** When you subscribe to HBO Max, their internal system (like `GAuth`) creates a `PartnerSubscription` for you, linking your account to the specific SKU you purchased. If you add a sports package later, the multi-SKU API could be used to add that new SKU to your existing subscription.

*   **Netflix, Hulu, Peacock:** The same logic applies:
    *   **Netflix:** "Basic," "Standard," "Premium" plans each have their own B2B product definition, price plans, and unique SKUs.
    *   **Hulu:** "Hulu (ad-supported)," "Hulu (No Ads)," "Hulu + Live TV" are distinct products with SKUs. If you bundle Disney+ and ESPN+, that's a multi-SKU transaction.
    *   **Peacock:** "Peacock Premium," "Peacock Premium Plus" would follow this structure.

In essence, the Commerce SKU Setup is the backend system that defines, prices, and tracks the various subscription offerings (products) of an OTT platform, allowing for flexible bundling, partner integration, and efficient subscription management.