In [1]:
import os
from dotenv import load_dotenv
load_dotenv()

True

In [24]:
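# Read the key that load_dotenv() pulled in from .env and fail early if it is missing
groq_api_key = os.getenv("GROQ_API_KEY")
if not groq_api_key:
    raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file")

In [38]:
from langchain_groq import ChatGroq
# Pass the key's value, not the literal string "api_key"
llm = ChatGroq(
    model_name="llama-3.3-70b-versatile", api_key=groq_api_key)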

In [26]:
import json
# Load JSON data
with open("Verifast_assignment_raw_data.json", "r") as f:
    data = json.load(f)

# Extract relevant information from JSON
texts = []
for item in data:
    event = item.get("event", "Unknown Event")
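    # Use .get() here too, so an event without "properties" does not raise KeyError
    properties = item.get("properties", {})
    customer_id = properties.get("customer_id", "Unknown Customer")
    product_info = properties.get("payload", {}).get("data", {}).get("productVariant", {}).get("product", {})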
    product_title = product_info.get("title", "Unknown Product")
    product_url = product_info.get("url", "Unknown URL")
    texts.append(f"Event: {event}\nCustomer ID: {customer_id}\nProduct: {product_title}\nURL: {product_url}")
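
The chained .get() calls above cope with missing keys, but they would still fail if an intermediate value is present and set to None. A minimal sketch of a more defensive lookup follows; the helper name `dig` and the sample events are hypothetical and are not part of the notebook or of any library.

```python
def dig(mapping, *keys, default=None):
    """Follow keys through nested dicts, returning default on any miss."""
    current = mapping
    for key in keys:
        # Stop as soon as the path leaves dict territory or the key is absent
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    # Treat an explicit None at the end of the path as missing too
    return current if current is not None else default

# Hypothetical events shaped like the JSON read above
complete_event = {"properties": {"payload": {"data": {"productVariant": {"product": {"title": "Ultra Healing Cream"}}}}}}
broken_event = {"properties": {"payload": None}}

path = ("properties", "payload", "data", "productVariant", "product", "title")
title_ok = dig(complete_event, *path, default="Unknown Product")
title_missing = dig(broken_event, *path, default="Unknown Product")
```

Inside the loop, a call such as `dig(item, *path, default="Unknown Product")` could stand in for the chained lookups if the raw data turns out to contain nulls.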

In [27]:
texts

['Event: product_viewed\nCustomer ID: 9dc992e7-619f-42b2-8847-2661f18171c5\nProduct: Restorative Shampoo for Anti Dandruff Treatment\nURL: /products/restorative-shampoo-for-sensitive-scalp',
 'Event: product_viewed\nCustomer ID: 0b307d43-5d30-4ff2-8aa8-98a8e50c921d\nProduct: Restorative Shampoo for Anti Dandruff Treatment\nURL: /products/restorative-shampoo-for-sensitive-scalp',
 'Event: product_viewed\nCustomer ID: d37578dd-7de0-4870-b128-3cce8fc9ac5e\nProduct: Small Patches - Eczema and Psoriasis Care Bundle\nURL: /products/small-patches-eczema-and-psoriasis-care-bundle',
 'Event: product_viewed\nCustomer ID: 816e25d5-fbbe-4c01-823a-b57007d29241\nProduct: Ultra Healing Cream (for Dry/ Cracked skin, Psoriasis or Eczema Patches & Flare Ups) (40gm)\nURL: /products/ultra-healing-foot-cream-40gm',
 'Event: product_viewed\nCustomer ID: a21aba27-0a9f-43df-9148-d9dac3af4741\nProduct: Skin Softening Body Oil (for Dry Skin, Eczema and Psoriasis prone skin, Scars and Marks) (100ml)\nURL: /produ

In [28]:
# Split the joined text into chunks of about 1000 characters with 200 characters of overlap
from langchain.text_splitter import CharacterTextSplitter
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

def summarize_text(text):
    # Despite the name, this truncates rather than summarizes:
    # everything after the first 1000 characters is dropped.
    if len(text) > 1000:
        return text[:1000] + "..."
    return text

In [29]:
chunks = text_splitter.split_text("\n".join(texts))

chunks = [summarize_text(chunk) for chunk in chunks]
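
One thing to watch in the two cells above: CharacterTextSplitter splits on "\n\n" by default, while the records here are joined with a single "\n". The splitter may therefore return one oversized chunk, which summarize_text then cuts to 1000 characters. That would be consistent with the answers further down reporting only 5 events. The sketch below shows one possible alternative, packing whole records into chunks under a character budget so no record is cut in half. The function name `pack_records` and the sample rows are hypothetical, and this is plain Python rather than a LangChain API.

```python
def pack_records(records, max_chars=1000, sep="\n\n"):
    """Greedily group whole records into chunks of at most max_chars characters."""
    chunks, current, size = [], [], 0
    for record in records:
        # Cost of adding this record, plus a separator if the chunk already has content
        extra = len(record) + (len(sep) if current else 0)
        if current and size + extra > max_chars:
            # Budget exceeded: close the current chunk and start a new one
            chunks.append(sep.join(current))
            current, size = [], 0
            extra = len(record)
        current.append(record)
        size += extra
    if current:
        chunks.append(sep.join(current))
    return chunks

# Shortened, hypothetical rows in the same layout as `texts`
sample = [
    "Event: product_viewed\nProduct: A",
    "Event: product_viewed\nProduct: B",
    "Event: product_viewed\nProduct: C",
]
packed = pack_records(sample, max_chars=70)
```

A record longer than max_chars still ends up in a chunk of its own at full length, so nothing is silently dropped. With the real data, `pack_records(texts)` could be passed to FAISS.from_texts in place of the truncated chunks.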

In [30]:
# Generate embeddings
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS  # FAISS lives in langchain_community (see the repr below)
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_db = FAISS.from_texts(chunks, embeddings)

In [31]:
vector_db

<langchain_community.vectorstores.faiss.FAISS at 0x186076814e0>

In [32]:
from langchain.chains import RetrievalQA

qa_chain = RetrievalQA.from_chain_type(
    llm=llm, 
    retriever=vector_db.as_retriever(search_kwargs={"k": 5}),
    chain_type="stuff",
    return_source_documents=True
)
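
With chain_type="stuff", the retrieved documents (five here, because k=5) are concatenated and placed into a single prompt alongside the question. The sketch below illustrates that idea in plain Python. The function name and the prompt wording are assumptions made for illustration and are not the template LangChain actually uses.

```python
def build_stuff_prompt(question, documents):
    """Concatenate every retrieved document into one prompt string."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical retrieved chunks
retrieved = [
    "Event: product_viewed\nProduct: Restorative Shampoo for Anti Dandruff Treatment",
    "Event: product_viewed\nProduct: Small Patches - Eczema and Psoriasis Care Bundle",
]
prompt = build_stuff_prompt("Which products were viewed by customers?", retrieved)
```

Because everything lands in one prompt, the combined size of the k chunks has to fit in the model's context window, which is the main trade-off of this strategy.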


In [33]:
if __name__ == "__main__":
    question = "Which products were viewed by customers?"
    answer = qa_chain.invoke(question)
    print("Answer:", answer)

Answer: {'query': 'Which products were viewed by customers?', 'result': 'The following products were viewed by customers:\n\n1. Restorative Shampoo for Anti Dandruff Treatment\n2. Small Patches - Eczema and Psoriasis Care Bundle\n3. Ultra Healing Cream (for Dry/ Cracked skin, Psoriasis or Eczema Patches & Flare Ups) (40gm)\n4. Skin Softening Body Oil (for Dry Skin, Eczema and Psoriasis prone skin, Scars and Marks) (100ml)', 'source_documents': [Document(id='48e9ca7f-6412-4a11-99ab-767e7d4410aa', metadata={}, page_content='Event: product_viewed\nCustomer ID: 9dc992e7-619f-42b2-8847-2661f18171c5\nProduct: Restorative Shampoo for Anti Dandruff Treatment\nURL: /products/restorative-shampoo-for-sensitive-scalp\nEvent: product_viewed\nCustomer ID: 0b307d43-5d30-4ff2-8aa8-98a8e50c921d\nProduct: Restorative Shampoo for Anti Dandruff Treatment\nURL: /products/restorative-shampoo-for-sensitive-scalp\nEvent: product_viewed\nCustomer ID: d37578dd-7de0-4870-b128-3cce8fc9ac5e\nProduct: Small Patches

In [34]:
if __name__ == "__main__":
    question = "Perform All statistical test you can perfrom on this data and we give results which can increase my ROI?"
    answer = qa_chain.invoke(question)
    print("Answer:", answer)

Answer: {'query': 'Perform All statistical test you can perfrom on this data and we give results which can increase my ROI?', 'result': "Based on the provided data, I can perform some basic statistical tests to gain insights that might help increase ROI. Please note that the data is limited, and more comprehensive analysis would require additional information.\n\n**Descriptive Statistics:**\n\n1. **Total number of events:** 5\n2. **Unique Customer IDs:** 5\n3. **Unique Products:** 4\n4. **Most viewed product:** Restorative Shampoo for Anti Dandruff Treatment (2 views)\n5. **Least viewed product:** All other products (1 view each)\n\n**Inferential Statistics:**\n\n1. **Product popularity:** Since we have a small sample size, I'll use a simple frequency analysis. The Restorative Shampoo for Anti Dandruff Treatment is the most popular product, with 2 views. This might indicate a higher demand for this product.\n2. **Customer behavior:** With only 5 events, it's challenging to draw conclus

In [35]:
# Query function
def query_llm(question: str):
    response = qa_chain.invoke(question)
    return response["result"]

# Example query 1
if __name__ == "__main__":
    question = "Perform All statistical test you can perfrom on this data and we give results which can increase my ROI?"
    answer = query_llm(question)
    print(answer)


Based on the provided data, I can perform some basic statistical tests to gain insights. However, please note that the data is limited, and more advanced analysis might require additional information.

**Data Summary:**

* Total Events: 5
* Unique Customer IDs: 5
* Unique Products: 4
* Unique URLs: 4

**Statistical Tests:**

1. **Frequency Analysis:**
	* Most viewed product: Restorative Shampoo for Anti Dandruff Treatment (2 views)
	* Most viewed URL: /products/restorative-shampoo-for-sensitive-scalp (2 views)
2. **Customer Behavior:**
	* Each customer viewed only one product.
	* No customer viewed the same product twice.
3. **Product Popularity:**
	* Restorative Shampoo for Anti Dandruff Treatment: 2 views
	* Small Patches - Eczema and Psoriasis Care Bundle: 1 view
	* Ultra Healing Cream: 1 view
	* Skin Softening Body Oil: 1 view
4. **URL Popularity:**
	* /products/restorative-shampoo-for-sensitive-scalp: 2 views
	* /products/small-patches-eczema-and-psoriasis-care-bundle: 1 view
	* /
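
The counts above come from the language model reading a handful of retrieved chunks, so they only cover what was retrieved. Descriptive statistics like these can also be computed directly and deterministically. The sketch below assumes records in the same "Key: value" layout as `texts` and uses three of the rows shown earlier as sample input; `parse_record` is a hypothetical helper.

```python
from collections import Counter

# Three rows copied from the `texts` output earlier in the notebook
sample_texts = [
    "Event: product_viewed\nCustomer ID: 9dc992e7-619f-42b2-8847-2661f18171c5\nProduct: Restorative Shampoo for Anti Dandruff Treatment\nURL: /products/restorative-shampoo-for-sensitive-scalp",
    "Event: product_viewed\nCustomer ID: 0b307d43-5d30-4ff2-8aa8-98a8e50c921d\nProduct: Restorative Shampoo for Anti Dandruff Treatment\nURL: /products/restorative-shampoo-for-sensitive-scalp",
    "Event: product_viewed\nCustomer ID: d37578dd-7de0-4870-b128-3cce8fc9ac5e\nProduct: Small Patches - Eczema and Psoriasis Care Bundle\nURL: /products/small-patches-eczema-and-psoriasis-care-bundle",
]

def parse_record(text):
    """Turn one 'Key: value' block back into a dict."""
    return dict(line.split(": ", 1) for line in text.split("\n"))

records = [parse_record(t) for t in sample_texts]

# Views per product and number of distinct customers
views_per_product = Counter(r["Product"] for r in records)
unique_customers = len({r["Customer ID"] for r in records})
top_product, top_views = views_per_product.most_common(1)[0]
```

Running the same code over the full `texts` list would give exact counts for the whole dataset, which could then be passed to the model as context for interpretation.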

In [36]:
# Example query 2
if __name__ == "__main__":
    question = "You are an ecommerce expert,you have generated sales of 1000cr in past and how all the knowledge required to scale business,seeing the data provided what suggestions you can make to improve your business"
    answer = query_llm(question)
    print(answer)

Based on the provided data, I can see that there are multiple customers viewing products related to skin and scalp care, specifically for conditions like eczema, psoriasis, and dandruff. Here are some suggestions to improve the business:

1. **Product Bundling**: Notice that multiple customers are viewing products that cater to similar skin conditions. Consider offering product bundles or kits that include a combination of these products, such as a "Psoriasis Care Kit" or "Eczema Relief Bundle". This can increase average order value and provide customers with a comprehensive solution.
2. **Content Marketing**: Create informative blog posts, videos, or guides that provide valuable information on managing eczema, psoriasis, and dandruff. This can help establish the brand as an authority in the skin care niche and attract more customers searching for solutions online.
3. **Targeted Advertising**: Use the data to create targeted ads on social media platforms, Google Ads, or email marketing