## Scenario 2 - Multi-tenancy

### **SupportWizard** - Support Analysis SaaS Platform

- Allows its users to sign up and upload their own customer support data
- Users then analyse that data on the platform to identify where they could improve their support processes

### Solution

Each end user will have their own isolated "space" (a tenant), to which they can upload data. They can then use the SupportWizard dashboards / platform to see analyses of their own data only.

## Set your preferred model type here

In [1]:
# model_type = "ollama"
model_type = "cohere"

### Then, run the cell below

In [2]:
from weaviate.classes.config import Configure

if model_type == "ollama":
    vectorizer_config = Configure.NamedVectors.text2vec_ollama(
        name="text_with_metadata",
        source_properties=["text", "company_author"],
        vector_index_config=Configure.VectorIndex.hnsw(),
        api_endpoint="http://host.docker.internal:11434",
        model="nomic-embed-text",
    )
    generative_config = Configure.Generative.ollama(
        api_endpoint="http://host.docker.internal:11434",
        model="gemma2:2b"
    )
else:
    vectorizer_config = Configure.NamedVectors.text2vec_cohere(
        name="text_with_metadata",
        source_properties=["text", "company_author"],
        vector_index_config=Configure.VectorIndex.hnsw(),
        model="embed-multilingual-light-v3.0",
    )

    generative_config = Configure.Generative.cohere(
        model="command-r-plus"
    )



### Create the collection


In [3]:
import os
import weaviate
from weaviate.classes.config import Property, DataType, Configure
from dotenv import load_dotenv

load_dotenv()

client = weaviate.connect_to_local(
    headers={"X-Cohere-Api-Key": os.getenv("WORKSHOP_COHERE_KEY")}
)

collection_name = "SupportChat"

# For re-running the demo only: Delete existing collection if it exists
client.collections.delete(collection_name)

# Create a new collection with specified properties and vectorizer configuration
chunks = client.collections.create(
    name=collection_name,
    properties=[
        Property(name="text", data_type=DataType.TEXT),
        Property(name="dialogue_id", data_type=DataType.INT),
        Property(name="company_author", data_type=DataType.TEXT),
        Property(name="created_at", data_type=DataType.DATE),
    ],
    vectorizer_config=[vectorizer_config],
    generative_config=generative_config,
    # ============================================================
    # ⬇️⬇️ This is the only change from the previous script ⬇️⬇️
    # ============================================================
    multi_tenancy_config=Configure.multi_tenancy(enabled=True, auto_tenant_creation=True)
)

### Helper functions for loading data

In [4]:
import h5py
import json
import numpy as np
from typing import Literal
from pathlib import Path


def get_hdf5_obj(file_path):
    with h5py.File(file_path, "r") as hf:
        for uuid in hf.keys():
            src_obj = hf[uuid]

            # Get the object properties
            properties = json.loads(src_obj["object"][()])

            # Get the vector(s)
            vectors = {}
            for key in src_obj.keys():
                if key.startswith("vector_"):
                    vector_name = key.split("_", 1)[1]
                    vectors[vector_name] = np.asarray(src_obj[key])

            yield uuid, properties, vectors


def get_data_obj(model_type: Literal["ollama", "cohere"]):
    file_path = Path("data/twitter_customer_support_nomic.h5")
    if model_type == "cohere":
        file_path = Path("data/twitter_customer_support_cohere.h5")

    for uuid, properties, vectors in get_hdf5_obj(file_path):
        yield uuid, properties, vectors

### Load data

In [5]:
from tqdm import tqdm

tenant_names = ["AcmeCo", "Globex", "Initech", "UmbrellaCorp", "WayneEnterprises"]

with client.batch.fixed_size(batch_size=200) as batch:
    for uuid, properties, vectors in tqdm(get_data_obj(model_type)):

        # Assign each object to a tenant, bucketed by the length of its company author name (demo only)
        tenant_index = len(properties['company_author']) % len(tenant_names)
        tenant_name = tenant_names[tenant_index]

        # Add the object to the batch
        batch.add_object(
            collection=collection_name,
            uuid=uuid,
            properties=properties,
            vector={"text_with_metadata": vectors["text_with_metadata"]},
            tenant=tenant_name  # <===== This is the only line that changes during import
        )


50000it [00:19, 2583.22it/s]
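
The import cell assigns tenants with a simple length-modulo rule. As a minimal sketch, the same rule can be pulled out into a standalone function to check which tenant a given company ends up in. The names `TENANT_NAMES`, `assign_tenant` and `tenant_distribution` are illustrative and not part of the notebook or of Weaviate.

```python
from collections import Counter

# Same tenant list as in the import cell (copied here so the sketch is self-contained)
TENANT_NAMES = ("AcmeCo", "Globex", "Initech", "UmbrellaCorp", "WayneEnterprises")


def assign_tenant(company_author, tenant_names=TENANT_NAMES):
    """Mirror the demo rule: bucket by the length of the company author name."""
    return tenant_names[len(company_author) % len(tenant_names)]


def tenant_distribution(authors, tenant_names=TENANT_NAMES):
    """Count how many of the given authors land in each tenant."""
    return Counter(assign_tenant(author, tenant_names) for author in authors)


sample_authors = ["AmazonHelp", "VirginAtlantic", "askpanera"]
for author in sample_authors:
    print(author, "->", assign_tenant(author))
```

This rule exists only to spread the demo data across tenants. In a real deployment the tenant would presumably be the signed-in customer's own identifier rather than something derived from the data.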


In [6]:
print(f"Processed {len(client.batch.results.objs.all_responses)} objects.")

Processed 50000 objects.




In [7]:
if len(client.batch.failed_objects) > 0:
    print("*" * 80)
    print(f"***** Failed to add {len(client.batch.failed_objects)} objects *****")
    print("*" * 80)
    print(client.batch.failed_objects[:3])

### Confirm data load

In [8]:
support_chats = client.collections.get(collection_name)

In [9]:
support_chats.tenants.get()


{'Globex': Tenant(name='Globex', activityStatusInternal=<TenantActivityStatus.ACTIVE: 'ACTIVE'>, activityStatus=<_TenantActivistatusServerValues.HOT: 'HOT'>),
 'AcmeCo': Tenant(name='AcmeCo', activityStatusInternal=<TenantActivityStatus.ACTIVE: 'ACTIVE'>, activityStatus=<_TenantActivistatusServerValues.HOT: 'HOT'>),
 'Initech': Tenant(name='Initech', activityStatusInternal=<TenantActivityStatus.ACTIVE: 'ACTIVE'>, activityStatus=<_TenantActivistatusServerValues.HOT: 'HOT'>),
 'UmbrellaCorp': Tenant(name='UmbrellaCorp', activityStatusInternal=<TenantActivityStatus.ACTIVE: 'ACTIVE'>, activityStatus=<_TenantActivistatusServerValues.HOT: 'HOT'>),
 'WayneEnterprises': Tenant(name='WayneEnterprises', activityStatusInternal=<TenantActivityStatus.ACTIVE: 'ACTIVE'>, activityStatus=<_TenantActivistatusServerValues.HOT: 'HOT'>)}

In [10]:
tenant_data = support_chats.with_tenant(tenant_names[0])

In [11]:
response = tenant_data.query.fetch_objects(limit=2, include_vector=True)

In [12]:
print(response.objects[0].uuid)

00023216-25c8-5d2c-bd5e-fa5bd486b566


In [13]:
for k, v in response.objects[0].properties.items():
    print(f"\n|| {k} || \n{v}")


|| text || 
User_124655: Argh, Hey Amazon! WHERE IS MY PACKAGE? Arriving between Nov 1&amp;3 doesn't mean shop Nov. 2 and maybe you'll get it the next day. WTF?
AmazonHelp: If your order doesn't arrive on the 3rd by 8:00, please contact us for options: https://t.co/EKXRLsnxJu ^AF

|| company_author || 
AmazonHelp

|| created_at || 
2017-11-01 14:53:08+00:00

|| dialogue_id || 
39592


In [14]:
for k, v in response.objects[0].vector.items():
    print(k, v[:3])

text_with_metadata [-0.1064453125, 0.04510498046875, 0.037200927734375]


### Queries

#### Helper function for displaying objects

In [15]:
def display_objects(response):
    for o in response.objects:
        print(o.uuid, "\n")
        print(o.properties["text"][:200], "\n")

In [16]:
response = tenant_data.query.near_text("return process", limit=3)
display_objects(response)

31a21f1b-61aa-5c01-bcbf-6dce68b2cb5e 

User_119904: @115850 I have bought a product and now it's size is not matching I want to return it and also requested return process.
AmazonHelp: You may refer here: https://t.co/M27c4qF86m for detail 

3247d280-de61-515f-a388-caaac81770c8 

User_206228: please check the DM sent to @AmazonHelp abd revert.
AmazonHelp: Hi, we have responded to you via DM. Please refer. ^RD
User_206228: @AmazonHelp Your response is of no help to me.I have al 

09648cc9-4e90-5e83-b6f7-ea8245021db8 

User_144630: I marked a package I got from @115821 today for a return and THEY ARE GOING TO CHARGE ME FOR RETURN SHIPPING for the first time ever. What's that about? Holidays? Or new all-the-time poli 



In [17]:
response = tenant_data.query.bm25("return process", limit=3)
display_objects(response)

31a21f1b-61aa-5c01-bcbf-6dce68b2cb5e 

User_119904: @115850 I have bought a product and now it's size is not matching I want to return it and also requested return process.
AmazonHelp: You may refer here: https://t.co/M27c4qF86m for detail 

09f5607c-2d4e-54f5-bcbc-74012994eb80 

User_295654: I have purchased Lenovo k8 note ....  But heating mobile I want to return it what is process
AmazonHelp: We're responding to your query via DM. Would request you to have a look at it. ^AB 

3c0cba72-874e-5119-aa3d-42dc3b480127 

User_233994: Currently trying to return a £7.99 book to you and there are no options to do so that don't cost me £3.99 which is absolutely appalling? AS IF it costs £4 to send something back?!?!
Amazo 



In [18]:
response = tenant_data.query.hybrid("return process", limit=3)
display_objects(response)

31a21f1b-61aa-5c01-bcbf-6dce68b2cb5e 

User_119904: @115850 I have bought a product and now it's size is not matching I want to return it and also requested return process.
AmazonHelp: You may refer here: https://t.co/M27c4qF86m for detail 

3247d280-de61-515f-a388-caaac81770c8 

User_206228: please check the DM sent to @AmazonHelp abd revert.
AmazonHelp: Hi, we have responded to you via DM. Please refer. ^RD
User_206228: @AmazonHelp Your response is of no help to me.I have al 

09648cc9-4e90-5e83-b6f7-ea8245021db8 

User_144630: I marked a package I got from @115821 today for a return and THEY ARE GOING TO CHARGE ME FOR RETURN SHIPPING for the first time ever. What's that about? Holidays? Or new all-the-time poli 



In [19]:
response = tenant_data.generate.fetch_objects(
    limit=20,
    grouped_task="What patterns are we seeing here in these issues?"
)

In [20]:
print(response.generated)

Based on the provided data, there are several patterns that can be observed in the issues presented:

- **Late or missing deliveries:** Customers are inquiring about the status of their packages, with some expressing frustration over late or missing deliveries.
- **Customer support:** Many users are reaching out to the companies' customer support teams for assistance with various issues, including refunds, product availability, and technical problems.
- **Product or service complaints:** Some customers are expressing dissatisfaction with the quality of products or services received, such as meals on board a flight or the placement of products in a store.
- **Account and payment inquiries:** Users are seeking information or assistance with account-related matters, such as tax benefits for business accounts or payment options (e.g., card installments).
- **Feedback and suggestions:** Customers are providing feedback and suggestions to the companies, such as requesting improvements to onl

## Example use cases

- Each end user (tenant) can upload & analyse their own data
- Analyse different aspects of their own support processes

In [21]:
response = tenant_data.generate.near_text(
    query="return process",
    limit=15,
    grouped_task="What are some of the problems our customers are having, and suggest areas to investigate for improvement.",
)

In [22]:
print(response.generated)

Here are some of the problems that customers are facing, along with suggested areas for improvement:

- **Problem:** Customers are confused and frustrated about return shipping charges and want free and convenient return options.
**Improvement area:** Review and communicate your return policy more transparently and consider offering free returns or providing clearer information on the website and during the purchase process about when return shipping charges apply.

- **Problem:** There are delays and a lack of clarity in the refund process after returning items.
**Improvement area:** Improve communication about the expected timeline for refunds and provide regular updates to customers. Ensure that your customer support team has the necessary tools and training to efficiently track and manage refunds.

- **Problem:** Customers face challenges with the return pickup process, including missed pickups and lengthy wait times for the return to be collected.
**Improvement area:** Enhance you

In [23]:
response = tenant_data.generate.near_text(
    query="phone battery",
    limit=15,
    grouped_task="What types of issues are our users having with their phone batteries?",
)

In [24]:
print(response.generated)

The users are facing a range of issues with their phone batteries, including:
- Rapid battery drain: Many users have reported that their phone batteries are draining quickly, sometimes within an hour or a few hours of usage. This issue seems to be prevalent after upgrading to iOS 11.
- Restarting at 100% battery: One user has mentioned that their phone keeps restarting when the battery reaches 100%.
- Exploding batteries: A few users have reported cases of their iPhone batteries exploding, which is a serious safety concern.
- Inaccurate battery percentage: One user has mentioned that their iPhone 8+ showed a battery drain from 75% to 49% within three hours, indicating a possible issue with the battery percentage indicator.
- Double charging: One user has reported being double-charged for their phone plan by their service provider.
- Defective batteries: One user has shared their experience with a potentially defective battery, for which they were asked to pay for a replacement by the m

## Tenant management

Given that our "tenants" represent different end users, it would be useful to have a way to manage them.

What can we do when:

- A new user signs up?
- A user wants to delete their account?
- A user asks about data privacy?
- A user is inactive for a long time?

#### Listing existing tenants

In [25]:
support_chats.tenants.get()

{'Globex': Tenant(name='Globex', activityStatusInternal=<TenantActivityStatus.ACTIVE: 'ACTIVE'>, activityStatus=<_TenantActivistatusServerValues.HOT: 'HOT'>),
 'AcmeCo': Tenant(name='AcmeCo', activityStatusInternal=<TenantActivityStatus.ACTIVE: 'ACTIVE'>, activityStatus=<_TenantActivistatusServerValues.HOT: 'HOT'>),
 'Initech': Tenant(name='Initech', activityStatusInternal=<TenantActivityStatus.ACTIVE: 'ACTIVE'>, activityStatus=<_TenantActivistatusServerValues.HOT: 'HOT'>),
 'UmbrellaCorp': Tenant(name='UmbrellaCorp', activityStatusInternal=<TenantActivityStatus.ACTIVE: 'ACTIVE'>, activityStatus=<_TenantActivistatusServerValues.HOT: 'HOT'>),
 'WayneEnterprises': Tenant(name='WayneEnterprises', activityStatusInternal=<TenantActivityStatus.ACTIVE: 'ACTIVE'>, activityStatus=<_TenantActivistatusServerValues.HOT: 'HOT'>)}

#### Tenant creation

In [26]:
from weaviate.classes.tenants import Tenant

support_chats.tenants.create(
    tenants=[
        Tenant(name="MarvellousCorp"),
        Tenant(name="InGenCompany"),
    ]
)

In [27]:
marvel_tenant = support_chats.with_tenant("MarvellousCorp")

some_objs = [
    {"text": "This comic is great", "dialogue_id": 123, "company_author": "Marvel"},
    {"text": "I am very excited about the new movie", "dialogue_id": 124, "company_author": "Marvel"},
]

marvel_tenant.data.insert_many(some_objs)

BatchObjectReturn(_all_responses=[UUID('0bc4a984-86a7-4354-83e1-2ceaf722c928'), UUID('271f5e82-ac9f-4601-a25b-3366107c8846')], elapsed_seconds=0.1732959747314453, errors={}, uuids={0: UUID('0bc4a984-86a7-4354-83e1-2ceaf722c928'), 1: UUID('271f5e82-ac9f-4601-a25b-3366107c8846')}, has_errors=False)

In [28]:
response = marvel_tenant.query.fetch_objects(limit=2)
for o in response.objects:
    print(o.properties["text"])

This comic is great
I am very excited about the new movie


#### Tenant privacy

Can multiple tenants be queried at once? No. A query issued on the collection without specifying a tenant is rejected:

In [29]:
response = support_chats.query.fetch_objects(limit=2)

print(response.objects)

WeaviateQueryError: Query call with protocol GRPC search failed with message <AioRpcError of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "explorer: list class: search: object search at index supportchat: class SupportChat has multi-tenancy enabled, but request was without tenant"
	debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"explorer: list class: search: object search at index supportchat: class SupportChat has multi-tenancy enabled, but request was without tenant", grpc_status:2, created_time:"2025-01-22T11:13:44.127159+00:00"}"
>.
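
Because every query must name a tenant, any cross-tenant view has to be built explicitly by the application, one tenant at a time. The sketch below assumes `collection` is a multi-tenant collection handle such as `support_chats`, and uses only the `with_tenant` and `query.fetch_objects` calls already shown in this notebook. The helper name `fetch_per_tenant` is hypothetical.

```python
def fetch_per_tenant(collection, tenant_names, limit=2):
    """Run the same fetch once per tenant and keep the results separated by tenant."""
    results = {}
    for name in tenant_names:
        # Each tenant gets its own scoped handle; there is no shared query across tenants
        tenant_view = collection.with_tenant(name)
        response = tenant_view.query.fetch_objects(limit=limit)
        results[name] = response.objects
    return results


# Hypothetical usage inside the notebook:
# per_tenant = fetch_per_tenant(support_chats, ["AcmeCo", "Initech"], limit=2)
```

A loop like this would normally be reserved for operator or admin tooling, since end users should only ever see their own tenant.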

#### Tenant state management



You can set each tenant's activity status to manage its resource usage, trading off availability against cost. An inactive tenant cannot be queried until it is set back to active.

In [30]:
from weaviate.classes.tenants import Tenant, TenantActivityStatus

support_chats.tenants.update(tenants=[
    Tenant(
        name="UmbrellaCorp",
        activity_status=TenantActivityStatus.INACTIVE
    ),
    Tenant(
        name="Globex",
        activity_status=TenantActivityStatus.INACTIVE
    ),
    Tenant(
        name="WayneEnterprises",
        activity_status=TenantActivityStatus.INACTIVE
    ),
])
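
To decide which tenants to deactivate, an application needs its own record of tenant activity. The following is a minimal sketch of that selection step, assuming a hypothetical `last_active` mapping of tenant name to last-use timestamp that the application maintains itself. This notebook does not record such timestamps, and the 30-day threshold is an arbitrary example value.

```python
from datetime import datetime, timedelta, timezone


def find_idle_tenants(last_active, now, max_idle=timedelta(days=30)):
    """Return the names of tenants whose last activity is older than max_idle."""
    return sorted(name for name, seen in last_active.items() if now - seen > max_idle)


# Illustrative data only: these timestamps are made up for the example
now = datetime(2025, 1, 22, tzinfo=timezone.utc)
last_active = {
    "AcmeCo": now - timedelta(days=2),
    "Globex": now - timedelta(days=45),
    "UmbrellaCorp": now - timedelta(days=90),
}

idle = find_idle_tenants(last_active, now)
print(idle)
```

The resulting names could then be passed to `support_chats.tenants.update(...)` with `TenantActivityStatus.INACTIVE`, as in the cell above.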

In [31]:
wayne_tenant = support_chats.with_tenant("WayneEnterprises")

response = wayne_tenant.query.fetch_objects(limit=2)
for o in response.objects:
    print(o.properties["text"])

WeaviateQueryError: Query call with protocol GRPC search failed with message <AioRpcError of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "explorer: list class: search: object search at index supportchat: tenant not active: 'WayneEnterprises'"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2025-01-22T11:14:47.934817+00:00", grpc_status:2, grpc_message:"explorer: list class: search: object search at index supportchat: tenant not active: \'WayneEnterprises\'"}"
>.
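
Before querying, it can help to see at a glance which tenants are currently active. The sketch below groups the dictionary returned by `support_chats.tenants.get()` by status. It assumes each `Tenant` object exposes an `activity_status` attribute, which is worth verifying against your installed client version. The helper name `tenants_by_status` is hypothetical.

```python
from collections import defaultdict


def tenants_by_status(tenants):
    """Group tenant names by activity status, given a {name: tenant} mapping."""
    grouped = defaultdict(list)
    for name, tenant in tenants.items():
        status = tenant.activity_status
        # Enum members expose .name; fall back to str() for anything else
        label = getattr(status, "name", str(status))
        grouped[label].append(name)
    return {label: sorted(names) for label, names in grouped.items()}


# Hypothetical usage inside the notebook:
# print(tenants_by_status(support_chats.tenants.get()))
```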

In [32]:
from weaviate.classes.tenants import Tenant, TenantActivityStatus

support_chats.tenants.update(tenants=[
    Tenant(
        name="WayneEnterprises",
        activity_status=TenantActivityStatus.ACTIVE
    ),
])

In [33]:
wayne_tenant = support_chats.with_tenant("WayneEnterprises")

response = wayne_tenant.query.fetch_objects(limit=2)
for o in response.objects:
    print(o.properties["text"])

User_171012: thank you customer relations team, after emailing them they sorted out my issue. Shame customer service  on phone couldn't
VirginAtlantic: Sorry for any frustration - glad the customer relations team could help ^S
User_188433: When u order a bread bowl from @116253 and they forget to give it to you 😡😤🙄
askpanera: Oh no - The Bread Bowl is the best part! Mind sharing your order number and bakery-cafe location in a DM?


#### Tenant deletion

Off-boarding customers is important, and Weaviate makes it straightforward: deleting a tenant deletes all of the data associated with it.
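
As a minimal sketch of an off-boarding step, the helper below checks that the tenant exists and then removes it with `tenants.remove`, which takes a list of tenant names. It assumes `collection` is a multi-tenant collection handle such as `support_chats`. The helper name `offboard_tenant` is hypothetical.

```python
def offboard_tenant(collection, tenant_name):
    """Remove a tenant and all of its data. Return False if the tenant does not exist."""
    if tenant_name not in collection.tenants.get():
        return False
    # Removing the tenant also removes every object stored under it
    collection.tenants.remove([tenant_name])
    return True


# Hypothetical usage inside the notebook:
# offboard_tenant(support_chats, "MarvellousCorp")
```

Since removal is irreversible, a production flow would probably add a confirmation or grace period before calling it.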


### Demo application

- Outside of the notebook
