In [3]:
from chromadb import PersistentClient
from chromadb.utils import embedding_functions
import os

topic = 'healthanddata'

chroma_client = PersistentClient(path=f"data/{topic}/storage")
emb_fn = embedding_functions.OpenAIEmbeddingFunction(
    api_key=os.environ["OPENAI_API_KEY"],  # read the key from the environment; never hardcode secrets
    model_name="text-embedding-ada-002"
)

In [4]:
desc = {
    'fullform': 'Healthcare and the related data laws and breaches',
}
topic_collection = chroma_client.get_or_create_collection(name=topic, embedding_function=emb_fn, metadata={"hnsw:space": "cosine"})
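
The `"hnsw:space": "cosine"` setting asks the collection to rank results by cosine distance. As a rough illustration of what that metric measures (a pure-Python sketch, not Chroma's implementation), cosine distance is one minus the cosine similarity of two vectors. The function name `cosine_distance` is hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Identical directions are at distance 0, orthogonal ones at distance 1.
same = cosine_distance([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_distance([1.0, 0.0], [0.0, 3.0])
```

Because only the direction of the vectors matters, scaling an embedding does not change its ranking under this metric.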

In [5]:
topic_collection.peek()

{'ids': [],
 'embeddings': [],
 'metadatas': [],
 'documents': [],
 'uris': None,
 'data': None}

In [6]:
from prompts.general import PROMPT_TO_EXTRACT_TRIPLETS
print(PROMPT_TO_EXTRACT_TRIPLETS)

ModuleNotFoundError: No module named 'prompts'

In [24]:
num_triplets = 5
requirement = 'understand the implications of the act on the organizations and how they can comply with the act'
first_chunk = """The bill grouped personal data into different categories and required elevated levels of protection for “sensitive” and “critical” personal data. Certain businesses were also to be categorized as “significant data fiduciaries,” and additional obligations were proposed for them—registration in India, data audits, and data impact assessments."""
first_triplets = (
    "(personal data, grouped into, different categories - sensitive and critical)\n"
    "(certain businesses, to be categorized as, significant data fiduciaries)\n"
    "(significant data fiduciaries, to require, additional obligations - registration in India, data audits, and data impact assessments)\n"
)
second_chunk = """The DPDP Act applies to Indian residents and businesses collecting the data of Indian residents. Interestingly, it also applies to non-citizens living in India whose data processing “in connection with any activity related to offering of goods or services” happens outside India."""
second_triplets = (
    "(DPDP Act, applies to, Indian residents and businesses collecting the data of Indian residents)\n"
    "(DPDP Act, applies to, non-citizens living in India whose data processing in connection with any activity related to offering of goods or services happens outside India)\n"
)
prompt = PROMPT_TO_EXTRACT_TRIPLETS.replace("<<topic>>", topic).replace("<<requirement>>", requirement).replace("<<FIRST_CHUNK_EX>>", first_chunk).replace("<<FIRST_CHUNK_TRIPLETS>>", first_triplets).replace("<<SECOND_CHUNK_EX>>", second_chunk).replace("<<SECOND_CHUNK_TRIPLETS>>", second_triplets).replace("<<num_triplets>>", str(num_triplets))
print(prompt)

Some text is provided below related to DPDP. Given the text, to better understand understand the implications of the act on the organizations and how they can comply with the act, extract up to 5 knowledge triplets in the form of (parent_topic, relation_type, topic). Avoid stopwords.
---------------------
Example:Text: The bill grouped personal data into different categories and required elevated levels of protection for “sensitive” and “critical” personal data. Certain businesses were also to be categorized as “significant data fiduciaries,” and additional obligations were proposed for them—registration in India, data audits, and data impact assessments.Triplets:
(personal data, grouped into, different categories - sensitive and critical)
(certain businesses, to be categorized as, significant data fiduciaries)
(significant data fiduciaries, to require, additional obligations - registration in India, data audits, and data impact assessments)

Text: The DPDP Act applies to Indian reside
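
The long chain of `.replace(...)` calls above fills `<<name>>` placeholders one at a time, and a misspelled placeholder would pass through silently. A minimal sketch of a helper that fills all placeholders from a dict and fails loudly if any `<<name>>` marker is left over. The helper name `fill_template` and the sample template are hypothetical, not part of the `prompts` module.

```python
import re


def fill_template(template, values):
    """Replace every <<key>> in template with str(values[key]).

    Raises ValueError if any <<placeholder>> remains unfilled.
    """
    filled = template
    for key, value in values.items():
        filled = filled.replace(f"<<{key}>>", str(value))
    leftover = re.findall(r"<<(\w+)>>", filled)
    if leftover:
        raise ValueError(f"unfilled placeholders: {sorted(set(leftover))}")
    return filled


# Hypothetical template, used only to demonstrate the helper.
example_template = "Text about <<topic>>: extract up to <<num_triplets>> triplets."
example_prompt = fill_template(
    example_template, {"topic": "DPDP", "num_triplets": 5}
)
```

Note that the check also fires if a substituted value itself contains a `<<name>>` marker, which may or may not be desirable.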

In [25]:
from llama_index.llms import OpenAI
llm = OpenAI(model="gpt-3.5-turbo")

In [27]:
resp = llm.complete(prompt)
print(resp.text)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
(DPDP Act, requires, elevated levels of protection)
(DPDP Act, requires, registration in India)
(DPDP Act, requires, data audits)
(DPDP Act, requires, data impact assessments)
(DPDP Act, applies to, organizations)


In [28]:
from llama_index import SimpleDirectoryReader

reader = SimpleDirectoryReader(input_dir="data/DPDP")
documents = reader.load_data()

In [31]:
from llama_index.node_parser import SentenceSplitter

# parse nodes
parser = SentenceSplitter()
nodes = parser.get_nodes_from_documents(documents)

In [32]:
nodes

[TextNode(id_='0922d34a-9e7a-430a-bd35-4b717b8de9af', embedding=None, metadata={'file_path': 'data/DPDP/01.txt', 'file_name': '01.txt', 'file_type': 'text/plain', 'file_size': 42776, 'creation_date': '2023-12-18', 'last_modified_date': '2023-12-18', 'last_accessed_date': '2023-12-18'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='e354e83b-ea26-4b2e-ba16-128f5fcc5b13', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'file_path': 'data/DPDP/01.txt', 'file_name': '01.txt', 'file_type': 'text/plain', 'file_size': 42776, 'creation_date': '2023-12-18', 'last_modified_date': '2023-12-18', 'last_accessed_date': '2023-12-18'}, hash='e2f4dbf6b671b44d39fe9138f89abac2fe05b6e135d51d282f6a17160e929d4b'), <NodeRelation

In [33]:
chunks = [x.text for x in nodes]
chunks

['Understanding India’s New Data Protection Law\nANIRUDH BURMAN\nOCTOBER 03, 2023\nPAPER\nSource: Getty\nSummary:  In early August 2023, the Indian Parliament passed the Digital Personal Data Protection (DPDP) Act, 2023. This working paper analyzes the law and evaluates its development over more than half a decade of deliberations.\nRelated Media and Tools\nPrint Page\nINTRODUCTION\nIn early August 2023, the Indian Parliament passed the Digital Personal Data Protection (DPDP) Act, 2023.1 The new law is the first cross-sectoral law on personal data protection in India and has been enacted after more than half a decade of deliberations.2 The key question this paper discusses is whether this seemingly interminable period of deliberations resulted in a “good” law—whether the law protects personal data adequately, and in addition, whether it properly balances, as the preamble to the law states, “the right of individuals to protect their personal data” on one hand and “the need to process su

In [47]:
import logging
logging.basicConfig(level=logging.INFO)
triplets = []
for chunk in chunks:
    resp = llm.complete(prompt.replace("{text}", chunk))
    triplets.extend(resp.text.split("\n"))
triplets

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


['(DPDP Act, passed by, Indian Parliament)',
 '(DPDP Act, first cross-sectoral law on, personal data protection in India)',
 '(DPDP Act, enacted after, more than half a decade of deliberations)',
 '(DPDP Act, protects, personal data adequately)',
 '(DPDP Act, balances, the right of individuals to protect their personal data and the need to process such personal data for lawful purposes)',
 '(regulatory structure, based on, 2018 draft bill proposed by the Srikrishna Committee)',
 '(2019 bill, exempted, certain entities and businesses from notice and consent requirements)',
 '(2019 bill, empowered, government to regulate nonpersonal data)',
 '(2019 bill, proposed, comprehensive cross-sectoral framework based on preventive requirements for businesses and rights for individuals or consumers)',
 '(DPDP Act, based on, draft proposed by the government in November 2022)',
 "(entity processing data, can do so, by taking the concerned individual's consent)",
 '(entity processing data, can do so,
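
The raw model output is a list of strings, and some of them may be blank or contain commas inside a field, so splitting on every comma can yield tuples with fewer or more than three parts. A minimal sketch of a stricter parser that keeps only clean three-field triplets and sets the rest aside for manual review. The function names `parse_triplet` and `parse_triplets` are hypothetical.

```python
def parse_triplet(line):
    """Return (subject, relation, object), or None if the line is not a clean triplet."""
    line = line.strip()
    if not (line.startswith("(") and line.endswith(")")):
        return None
    # Strip only the outer parentheses, then split into fields.
    parts = [part.strip() for part in line[1:-1].split(",")]
    if len(parts) != 3 or not all(parts):
        return None
    return tuple(parts)


def parse_triplets(lines):
    """Split lines into (parsed triplets, rejected non-blank lines)."""
    parsed, rejected = [], []
    for line in lines:
        triplet = parse_triplet(line)
        if triplet is not None:
            parsed.append(triplet)
        elif line.strip():
            rejected.append(line)
    return parsed, rejected


parsed, rejected = parse_triplets([
    "(DPDP Act, passed by, Indian Parliament)",
    "",
    "(entities collecting, storing, and processing data, defined as, data fiduciaries)",
])
```

Rejected lines are not recoverable by comma splitting alone, since the field boundaries are ambiguous; asking the model for a different delimiter would be one way around that.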

In [49]:
# Assuming triplets is a list of strings like '(DPDP Act, passed by, Indian Parliament)'
new_triplets = [t.replace('(', '').replace(')', '') for t in triplets]  # Remove parentheses
new_triplets = [tuple(map(str.strip, t.split(','))) for t in new_triplets]  # Split by comma and convert to tuple
new_triplets

[('DPDP Act', 'passed by', 'Indian Parliament'),
 ('DPDP Act',
  'first cross-sectoral law on',
  'personal data protection in India'),
 ('DPDP Act', 'enacted after', 'more than half a decade of deliberations'),
 ('DPDP Act', 'protects', 'personal data adequately'),
 ('DPDP Act',
  'balances',
  'the right of individuals to protect their personal data and the need to process such personal data for lawful purposes'),
 ('regulatory structure',
  'based on',
  '2018 draft bill proposed by the Srikrishna Committee'),
 ('2019 bill',
  'exempted',
  'certain entities and businesses from notice and consent requirements'),
 ('2019 bill', 'empowered', 'government to regulate nonpersonal data'),
 ('2019 bill',
  'proposed',
  'comprehensive cross-sectoral framework based on preventive requirements for businesses and rights for individuals or consumers'),
 ('DPDP Act', 'based on', 'draft proposed by the government in November 2022'),
 ('entity processing data',
  'can do so',
  "by taking the con

In [56]:
# _unique_nodes = [{
#     'id': "->".join(n),
#     'text': f'Subject: Digital Personal Data Protection (DPDP) Act \nSub-Topic: {"->".join(n)}'
#     } for n in new_triplets]
# _unique_nodes
_unique_nodes = [{
    'id': "->".join(n),
    'text': f'{"->".join(n)}'
    } for n in new_triplets]
_unique_nodes

[{'id': 'DPDP Act->passed by->Indian Parliament',
  'text': 'DPDP Act->passed by->Indian Parliament'},
 {'id': 'DPDP Act->first cross-sectoral law on->personal data protection in India',
  'text': 'DPDP Act->first cross-sectoral law on->personal data protection in India'},
 {'id': 'DPDP Act->enacted after->more than half a decade of deliberations',
  'text': 'DPDP Act->enacted after->more than half a decade of deliberations'},
 {'id': 'DPDP Act->protects->personal data adequately',
  'text': 'DPDP Act->protects->personal data adequately'},
 {'id': 'DPDP Act->balances->the right of individuals to protect their personal data and the need to process such personal data for lawful purposes',
  'text': 'DPDP Act->balances->the right of individuals to protect their personal data and the need to process such personal data for lawful purposes'},
 {'id': 'regulatory structure->based on->2018 draft bill proposed by the Srikrishna Committee',
  'text': 'regulatory structure->based on->2018 draft b

In [57]:
topic_collection.add(
    documents=[x['text'] for x in _unique_nodes],
    metadatas=[{
        'topic': topic,
    } for x in _unique_nodes],
    ids=[x['id'] for x in _unique_nodes]
)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
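
Since each id is derived from the triplet text, two identical triplets produce the same id, and Chroma expects the ids passed in one `add` call to be unique. A minimal sketch, assuming the same `id`/`text` node shape used above, that drops repeated triplets while preserving first-seen order. The function name `build_unique_nodes` is hypothetical.

```python
def build_unique_nodes(triplets, sep="->"):
    """Build one {'id', 'text'} node per distinct triplet, keeping first-seen order."""
    nodes = {}
    for triplet in triplets:
        node_id = sep.join(triplet)
        # setdefault keeps the first node for an id and ignores later repeats.
        nodes.setdefault(node_id, {"id": node_id, "text": node_id})
    return list(nodes.values())


unique_nodes = build_unique_nodes([
    ("DPDP Act", "passed by", "Indian Parliament"),
    ("DPB", "can pass orders", "issuing penalties"),
    ("DPDP Act", "passed by", "Indian Parliament"),
])
```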


In [58]:
def _query(query, n_results=10):
    return topic_collection.query(query_texts=[query], n_results=n_results)['ids'][0]

In [61]:
_query('What do businesses/fiduciaries need to do to ensure compliance?')

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


['consumers->will exercise->their rights against data fiduciaries',
 'entities responsible for collecting->storing->and processing digital personal data->defined as->data fiduciaries',
 'Section 37 of the law->enables->central government to block access to any information that can be communicated by a data fiduciary',
 'businesses->will inform->consumers and the DPB about data breaches',
 'DPB->can pass orders->issuing penalties or imposing voluntary settlements for noncompliance with the law',
 'DPB->has limited powers->ensuring remedial actions against any data breaches and issuing directions to businesses requiring them to comply with the law',
 'board->should create->certain checks and balances for issuing directions',
 'DPDP Act->requires->board to observe certain specified procedural rules while conducting inquiries and issuing penalties',
 'board->should provide->regulated entity with a formal opportunity to furnish their response to a draft direction before such a direction is 

In [70]:
PROMPT_TO_GENERATE_RESPONSES = (
    "Some information on <<topic>> is provided below. The information is of the form (parent_topic, relation_type, topic). "
    "Answer the user query using the information provided.\n"
    "---------------------\n"
    "<<triplets>>\n"
    "---------------------\n"
    "User query: <<query>>\n"
    "Response: "
)
print(PROMPT_TO_GENERATE_RESPONSES)

Some information on <<topic>> is provided below. The information is of the form (parent_topic, relation_type, topic). Answer the user query using the information provided.
---------------------
<<triplets>>
---------------------
User query: <<query>>
Response: 


In [73]:
user_query = 'List all the atomic things businesses/fiduciaries need to do to ensure compliance.'
triplets_prompt = "\n".join(_query(user_query, 40))
prompt = PROMPT_TO_GENERATE_RESPONSES.replace("<<topic>>", topic).replace("<<triplets>>", triplets_prompt).replace("<<query>>", user_query)
resp = llm.complete(prompt).text
print(resp)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
- Businesses need to inform consumers and the DPB about data breaches.
- Businesses need to comply with the law and follow the directions issued by the DPB.
- Businesses need to collect, store, and process digital personal data as defined by the DPB.
- Businesses need to provide a formal opportunity for regulated entities to respond to draft directions before they are formally issued.
- Businesses need to ensure remedial actions against any data breaches.
- Businesses need to pass orders, such as issuing penalties or imposing voluntary settlements, for noncompliance with the law.
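
Several bullets above attribute the board's duties to businesses. The prompt describes the context as `(parent_topic, relation_type, topic)` triplets, while the retrieved ids use the `a->b->c` form. A minimal sketch, assuming one wants the context to match the prompt's description, that converts ids back to parenthesized triplets before they are joined into the prompt. The helper names are hypothetical, and whether this changes the answers is untested.

```python
def id_to_triplet_line(node_id, sep="->"):
    """Turn 'a->b->c' into '(a, b, c)'."""
    parts = [part.strip() for part in node_id.split(sep)]
    return "(" + ", ".join(parts) + ")"


def format_context(node_ids):
    """Join retrieved ids into one triplet per line."""
    return "\n".join(id_to_triplet_line(node_id) for node_id in node_ids)


context = format_context([
    "DPB->can pass orders->issuing penalties",
    "businesses->will inform->consumers and the DPB about data breaches",
])
```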


In [134]:
def ask_query(query, n_results=30):
    triplets_prompt = "\n".join(_query(query, n_results))
    prompt = PROMPT_TO_GENERATE_RESPONSES.replace("<<topic>>", topic).replace("<<triplets>>", triplets_prompt).replace("<<query>>", query)
    resp = llm.complete(prompt).text
    return resp

In [75]:
print(ask_query('what is DPB vs DPDP?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
DPB refers to the Data Protection Bill, which is a proposed legislation that establishes the DPB (Data Protection Board) as a regulatory entity. On the other hand, DPDP refers to the Data Protection and Privacy Act, which is an enacted law that serves as the first cross-sectoral law on personal data protection in India. The DPDP Act replaces the idea of an independent regulator like the DPA (Data Protection Authority) and creates a regulatory structure based on the 2018 draft bill proposed by the Srikrishna Committee.


In [76]:
print(ask_query('Isn\'t DPB unnecessary once DPDP is passed?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
No, the DPB (Data Protection Board) is not unnecessary once the DPDP (Data Protection and Privacy Act) is passed. The DPB is established by the 2023 law as a regulatory entity and has limited powers to ensure remedial actions against data breaches and issue directions to businesses for compliance with the law. It also has the authority to pass orders and impose penalties for noncompliance. Therefore, the DPB plays a crucial role in enforcing the provisions of the DPDP Act and ensuring data protection and privacy.


In [79]:
print(ask_query('I am writing an article on DPDP Act listing the top 5 things that businesses need to do to ensure compliance. Can you help me with that?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Based on the information provided, here are the top 5 things that businesses need to do to ensure compliance with the DPDP Act:

1. Inform Consumers and DPB about Data Breaches: Businesses should promptly inform both consumers and the Data Protection Board (DPB) about any data breaches that occur.

2. Comply with DPB Directions: The DPB has the power to issue directions to businesses requiring them to comply with the law. Businesses should ensure they follow these directions and take the necessary remedial actions.

3. Observe Procedural Rules: The DPDP Act requires the DPB to observe certain specified procedural rules while conducting inquiries and issuing penalties. Businesses should ensure they are aware of and comply with these rules.

4. Create Checks and Balances: The board should create certain checks and 

In [80]:
# Inform Consumers and DPB about Data Breaches
print(ask_query('tell me more about DPDP aspect of informing consumers and DPB about data breaches'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
According to the information provided, businesses are responsible for informing consumers and the DPB (Data Protection Board) about data breaches. This means that if a business experiences a data breach, they are required to notify both the affected consumers and the DPB. This is an important aspect of the DPDP (Data Protection and Privacy Act) as it ensures that consumers are made aware of any breaches that may have compromised their personal data.


In [81]:
# Comply with DPB Directions
print(ask_query('What are the DPB directions that businesses need to comply with?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The DPB directions that businesses need to comply with include ensuring remedial actions against any data breaches and complying with the law.


In [82]:
# Observe Procedural Rules
print(ask_query('What are the procedural rules that businesses need to observe?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Businesses need to observe certain specified procedural rules while conducting inquiries and issuing penalties as required by the DPDP Act.


In [83]:
# Create Checks and Balances
print(ask_query('What are the checks and balances that businesses need to create?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Businesses need to create certain checks and balances for issuing directions, as stated in the information provided.


In [85]:
requirements = "info on specific actions that the data fiduciaries need to take to ensure compliance - include DPB directions, procedural rules, checks and balances, etc."
# Extract more triplets
prompt = PROMPT_TO_EXTRACT_TRIPLETS.replace("<<topic>>", topic).replace("<<requirement>>", requirements).replace("<<FIRST_CHUNK_EX>>", first_chunk).replace("<<FIRST_CHUNK_TRIPLETS>>", first_triplets).replace("<<SECOND_CHUNK_EX>>", second_chunk).replace("<<SECOND_CHUNK_TRIPLETS>>", second_triplets).replace("<<num_triplets>>", str(num_triplets))
print(prompt)

Some text is provided below related to DPDP. Given the text, to better understand info on specific actions that the data fiduciaries need to take to ensure compliance - include DPB directions, procedural rules, checks and balances, etc., extract up to 5 knowledge triplets in the form of (parent_topic, relation_type, topic). Avoid stopwords.
---------------------
Example:Text: The bill grouped personal data into different categories and required elevated levels of protection for “sensitive” and “critical” personal data. Certain businesses were also to be categorized as “significant data fiduciaries,” and additional obligations were proposed for them—registration in India, data audits, and data impact assessments.Triplets:
(personal data, grouped into, different categories - sensitive and critical)
(certain businesses, to be categorized as, significant data fiduciaries)
(significant data fiduciaries, to require, additional obligations - registration in India, data audits, and data impact

In [86]:
for chunk in chunks:
    resp = llm.complete(prompt.replace("{text}", chunk))
    triplets.extend(resp.text.split("\n"))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


In [92]:
new_triplets = set(t.replace('(', '').replace(')', '') for t in triplets)  # Remove parentheses
new_triplets = [tuple(map(str.strip, t.split(','))) for t in new_triplets]  # Split by comma and convert to tuple
new_triplets

[('board',
  'should provide',
  'regulated entity with a formal opportunity to furnish their response to a draft direction before such a direction is formally issued to them'),
 ('entity processing data',
  'can do so',
  "by taking the concerned individual's consent"),
 ('reductions in rights and obligations',
  'have been recast to - right to "erasure"'),
 ("DPB's limited mandate",
  'will create less frequent touchpoints with the economy'),
 ('2023 law',
  'states that',
  'government may restrict data flows to certain countries by notification'),
 ('central government', 'framed', 'rules to implement the law'),
 ('significant data fiduciaries',
  'additional obligations',
  'appointing a data protection officer based in India who will be answerable to the board of directors or the governing body of the SDF and will also serve as the point of contact for grievance redressal'),
 ('right to be forgotten', 'recast to', 'right to "erasure"'),
 ('exceptions carved out for consent', 'empo

In [93]:
_unique_nodes = [{
    'id': "->".join(n),
    'text': f'{"->".join(n)}'
    } for n in new_triplets]
_unique_nodes, len(_unique_nodes)

([{'id': 'board->should provide->regulated entity with a formal opportunity to furnish their response to a draft direction before such a direction is formally issued to them',
   'text': 'board->should provide->regulated entity with a formal opportunity to furnish their response to a draft direction before such a direction is formally issued to them'},
  {'id': "entity processing data->can do so->by taking the concerned individual's consent",
   'text': "entity processing data->can do so->by taking the concerned individual's consent"},
  {'id': 'reductions in rights and obligations->have been recast to - right to "erasure"',
   'text': 'reductions in rights and obligations->have been recast to - right to "erasure"'},
  {'id': "DPB's limited mandate->will create less frequent touchpoints with the economy",
   'text': "DPB's limited mandate->will create less frequent touchpoints with the economy"},
  {'id': '2023 law->states that->government may restrict data flows to certain countries b

In [94]:
topic_collection.add(
    documents=[x['text'] for x in _unique_nodes],
    metadatas=[{
        'topic': topic,
    } for x in _unique_nodes],
    ids=[x['id'] for x in _unique_nodes]
)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


In [95]:
topic_collection.count()

87

In [96]:
print(ask_query('What are the procedural rules that businesses need to observe?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The procedural rules that businesses need to observe include the manner in which notices will be given to consumers, the manner in which businesses will inform their consumers and the DPB about data breaches, and the manner in which consent managers will function.


In [97]:
print(ask_query('What are the DPB directions that businesses need to comply with?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The DPB (Data Protection Board) can issue directions to businesses requiring them to comply with the law. The specific directions that businesses need to comply with are not mentioned in the provided information.


In [98]:
print(ask_query('What are the checks and balances that businesses need to create?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Businesses need to create certain checks and balances for issuing directions, as stated in the information provided.


In [99]:
print(ask_query('What is DPB?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
DPB stands for Data Protection Board. It is a regulatory entity established by the 2023 law. The DPB has limited powers, including issuing directions to businesses requiring them to comply with the law, ensuring remedial actions against any data breaches, and passing orders issuing penalties or imposing voluntary settlements for noncompliance with the law.


In [100]:
print(ask_query('Obligations on Data Fiduciaries'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The obligations on data fiduciaries include maintaining security safeguards, ensuring the completeness, accuracy, and consistency of personal data, and intimating data breaches to the Data Protection Board of India (DPB) in a prescribed manner. Additionally, significant data fiduciaries have the additional obligation of appointing a data protection officer based in India who will be answerable to the board of directors or the governing body of the organization and will also serve as the point of contact for grievance redressal.


In [101]:
print(ask_query('What is the role of the DPB?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The role of the DPB (Data Protection Board) is to issue directions to businesses requiring them to comply with the law, ensure remedial actions against any data breaches, and pass orders issuing penalties or imposing voluntary settlements for noncompliance with the law. The DPB's powers are limited to these functions, and it is established as a regulatory entity under the 2023 law.


In [102]:
print(ask_query('List the categories DPDP defines?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The DPDP defines the following categories:

1. Data fiduciaries: Entities responsible for collecting, storing, and processing digital personal data.

2. Consumers: Individuals who have rights against data fiduciaries.

3. Individuals: The DPDP Act creates rights and obligations for individuals.

4. Businesses: The DPB can pass orders and issue penalties or impose voluntary settlements for noncompliance with the law.

5. Regulatory entity: The 2023 law establishes the DPB as a regulatory entity.

6. Purposes and entities exempted: The 2023 law exempts certain purposes and entities completely from its purview.

Please note that these categories are based on the information provided and may not be exhaustive.


In [105]:
print(ask_query('What are the different aspects of DPDP?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The different aspects of DPDP include:
- The DPDP Act is the first cross-sectoral law on personal data protection in India.
- The Act requires the DPB (Data Protection Board) to observe certain specified procedural rules while conducting inquiries and issuing penalties.
- The DPB has limited powers, including ensuring remedial actions against data breaches and issuing directions to businesses to comply with the law.
- The DPDP Act creates rights and obligations for individuals, balancing their right to protect personal data with the need for lawful processing.
- The Act does away with the idea of an independent regulator like the DPA (Data Protection Authority).
- The 2018 and 2019 versions of the bill adopted an expansive and all-encompassing framework toward data protection.
- The 2018 and 2019 drafts included 

In [106]:
topic = "DPDP-chunks"
desc = {
    'fullform': 'Digital Personal Data Protection Act; but with chunks',
}
topic_2_collection = chroma_client.get_or_create_collection(name=topic, embedding_function=emb_fn, metadata={"hnsw:space": "cosine"})

In [107]:
chunks

['Understanding India’s New Data Protection Law\nANIRUDH BURMAN\nOCTOBER 03, 2023\nPAPER\nSource: Getty\nSummary:  In early August 2023, the Indian Parliament passed the Digital Personal Data Protection (DPDP) Act, 2023. This working paper analyzes the law and evaluates its development over more than half a decade of deliberations.\nRelated Media and Tools\nPrint Page\nINTRODUCTION\nIn early August 2023, the Indian Parliament passed the Digital Personal Data Protection (DPDP) Act, 2023.1 The new law is the first cross-sectoral law on personal data protection in India and has been enacted after more than half a decade of deliberations.2 The key question this paper discusses is whether this seemingly interminable period of deliberations resulted in a “good” law—whether the law protects personal data adequately, and in addition, whether it properly balances, as the preamble to the law states, “the right of individuals to protect their personal data” on one hand and “the need to process su

In [109]:
topic_2_collection.add(
    documents=chunks,
    metadatas=[{
        'topic': topic,
    } for x in chunks],
    ids=[f'{i}' for i in range(len(chunks))]
)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


In [110]:
PROMPT_FOR_TOP_K = (
    "Some information on <<topic>> is provided below. The information is of the form of excerpts from an article. "
    "Answer the user query using the information provided.\n"
    "---------------------\n"
    "<<chunks>>\n"
    "---------------------\n"
    "User query: <<query>>\n"
    "Response: "
)
print(PROMPT_FOR_TOP_K)

Some information on <<topic>> is provided below. The information is of the form of excerpts from an article. Answer the user query using the information provided.
---------------------
<<chunks>>
---------------------
User query: <<query>>
Response: 
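
The template uses `<<name>>` markers that are filled with chained `str.replace` calls later in the notebook. A small sketch of an alternative that fills all markers in a single pass, so text inserted for one marker is never rescanned for another. `fill_prompt` and `PLACEHOLDER` are hypothetical names introduced here.

```python
import re

PLACEHOLDER = re.compile(r"<<(\w+)>>")


def fill_prompt(template, **values):
    """Substitute every <<name>> placeholder in one pass.

    Text inserted for one placeholder is never rescanned, so an excerpt that
    happens to contain "<<query>>" is left untouched. A missing value raises
    KeyError instead of silently leaving the marker in the prompt.
    """
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


template = "Topic: <<topic>>\n<<chunks>>\nUser query: <<query>>\nResponse: "
prompt = fill_prompt(
    template,
    topic="DPDP-chunks",
    chunks="An excerpt that mentions <<query>> literally.",
    query="What happens on non-compliance?",
)
print(prompt)
```

With chained `replace` calls, the literal `<<query>>` inside the excerpt would have been overwritten by the user query as well.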


In [119]:
def _query2(query, n_results=2):
    """Return the texts of the n_results closest chunks; index 0 selects the single query sent."""
    return topic_2_collection.query(query_texts=[query], n_results=n_results)['documents'][0]

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


['In addition to monetary penalties, the bill allows data fiduciaries to provide voluntary undertakings to the board as a form of settlement of any complaints against them.26 Therefore, the board is a very different institution in design compared to the DPA.\n\nFinally, the 2023 law contains a novel provision not included or discussed in any previous version. This is Section 37, which allows the government, based on a reference from the board, to block the public’s access to any information that enables a data fiduciary to provide goods or services in India. This has to be based on two criteria: (a) the board has imposed penalties against such data fiduciaries on two or more prior occasions, and (b) the board has recommended a blockage. The government has to provide the data fiduciary an opportunity to be heard before taking such action.\n\nANALYZING THE DPDP ACT, 2023\nThis section analyzes the 2023 act from two perspectives. First, it explains the broad structure of the law and highl
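
`_query2` keeps only the document texts. When judging whether the retrieved excerpts are actually close to the query, it can help to look at the distances too. The sketch below assumes a result dictionary shaped like the one `Collection.query()` returns when called with `include=["documents", "distances"]`. `ranked_excerpts` is a hypothetical helper, and the sample result is a hand-written stand-in, not real output.

```python
def ranked_excerpts(result, preview=80):
    """Pair each retrieved document with its distance for the first query.

    `result` is expected to hold one inner list per query text under the
    "documents" and "distances" keys.
    """
    docs = result["documents"][0]
    dists = result["distances"][0]
    return [(round(d, 4), doc[:preview]) for doc, d in zip(docs, dists)]


# Hand-written stand-in for a query result; ids, texts and distances are made up.
fake_result = {
    "ids": [["12", "3"]],
    "documents": [["first retrieved excerpt ...", "second retrieved excerpt ..."]],
    "distances": [[0.1723, 0.2051]],
}
for distance, text in ranked_excerpts(fake_result):
    print(distance, text)

# With a live collection:
# result = topic_2_collection.query(
#     query_texts=[query], n_results=2, include=["documents", "distances"]
# )
```

Because the collection was created with `"hnsw:space": "cosine"`, smaller distances mean closer matches.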

In [138]:
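def ask_query2(query, n_results=2):
    """Answer the query with the LLM, using the top n_results chunks as context."""
    # Named `context` so the local does not shadow the global `chunks` list.
    context = "\n".join(_query2(query, n_results))
    prompt = PROMPT_FOR_TOP_K.replace("<<topic>>", topic).replace("<<chunks>>", context).replace("<<query>>", query)
    return llm.complete(prompt).text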

In [143]:
print(ask_query('What are the different aspects of DPDP?, write it in a simple language to describe in an article'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The DPDP (Data Protection and Privacy) Act is a law in India that aims to protect personal data. It is the first law in India that covers personal data protection across different sectors. The Act does away with the idea of having an independent regulator and instead establishes the DPB (Data Protection Board) as a regulatory entity. The DPB has the power to issue penalties or impose settlements for noncompliance with the law. The Act also requires the board to follow certain procedural rules when conducting inquiries and issuing penalties. 

The 2018 and 2019 drafts of the Act included provisions that were not directly related to data privacy. However, the 2019 bill proposed a comprehensive framework for data protection, focusing on preventive requirements for businesses and rights for individuals. The regulator

In [124]:
print(ask_query2('What are the different aspects of DPDP?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The different aspects of DPDP (Data Protection and Privacy) are as follows:

1. Regulatory Powers: The DPB (Data Protection Board) under the DPDP Act has limited regulation-making powers. Its main role is to ensure remedial actions against data breaches and issue directions to businesses for compliance with the law. It can also pass orders for penalties or voluntary settlements for noncompliance.

2. Shift in Approach: The DPDP Act represents a major shift in approach compared to previous versions of the law. It does away with the idea of an independent regulator like the DPA (Data Protection Authority) and focuses more on remedial actions and compliance.

3. Incremental Shifts: The development of the DPDP Act has occurred incrementally over the years. The 2018 and 2019 versions proposed an expansive law based on

In [147]:
print(ask_query('What happens on non-compliance?', 30))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
On non-compliance, the DPB (Data Protection Board) can pass orders issuing penalties or imposing voluntary settlements for noncompliance with the law. The DPA (Data Protection Authority) would have had powers to impose penalties for noncompliance. The DPB's powers are limited to ensuring remedial actions against any data breaches and issuing directions to businesses requiring them to comply with the law. The DPDP Act also requires the board to observe certain specified procedural rules while conducting inquiries and issuing penalties.


In [142]:
print(ask_query2('List the categories DPDP defines?'))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The provided information does not mention any specific categories defined by DPDP (Data Protection and Privacy Act).
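
With only two excerpts, the model reports that the context does not cover the question. One possible follow-up, sketched here under loud assumptions, is to retry with more excerpts when the answer contains such a phrase. `ask_with_widening`, the `sizes` schedule and the `marker` string are all hypothetical. The `retrieve` and `complete` arguments stand in for `_query2` and `llm.complete(...).text`, and are replaced by stubs so the sketch runs offline.

```python
def ask_with_widening(query, retrieve, complete, sizes=(2, 5, 10), marker="does not mention"):
    """Retry with more excerpts while the answer says the context lacks the information.

    retrieve(query, n) -> list of excerpt strings; complete(prompt) -> answer string.
    Returns (answer, n) for the first answer without the marker, else the last attempt.
    """
    answer, used = "", 0
    for n in sizes:
        context = "\n".join(retrieve(query, n))
        answer = complete(f"{context}\nUser query: {query}\nResponse: ")
        used = n
        if marker not in answer.lower():
            break
    return answer, used


# Stubs so the sketch runs without a vector store or an API key.
corpus = ["penalties", "the board", "consent", "sensitive and critical personal data"]


def stub_retrieve(query, n):
    return corpus[:n]


def stub_complete(prompt):
    if "sensitive" in prompt:
        return "Sensitive and critical."
    return "The information does not mention categories."


print(ask_with_widening("List the categories?", stub_retrieve, stub_complete))
```

Matching on a phrase is a crude heuristic; it only catches refusals worded the way the marker expects.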
