---
**llama-index with chroma-db** 

---

### Import dependencies

In [1]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext, Settings
from llama_index.core.node_parser import SentenceSplitter
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb

# Load the OpenAI API key from a .env file
from dotenv import load_dotenv
import os

### Environment variables and constants

In [2]:
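load_dotenv()  # reads .env and adds its variables to os.environ
# Fail early with a clear message; re-assigning os.getenv(...) to os.environ
# raises a TypeError when the variable is missing
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; check your .env file")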

DATA_PATH = '../data'
VECTOR_STORE_PATH = '../chroma_db'

CHUNK_SIZE=512
CHUNK_OVERLAP=10

Python-dotenv could not parse statement starting at line 2
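
The warning above means that a line in the `.env` file does not look like an assignment. A minimal sketch for locating such lines is shown here; the helper name `suspicious_env_lines` and the pattern are assumptions made for illustration, and the pattern is deliberately simpler than python-dotenv's real grammar (which also handles quoting and other forms), so it may flag lines that python-dotenv would accept.

```python
import re

# Simplified pattern: optional "export", a shell-style name, "=", then any value.
# This is an illustrative approximation, not python-dotenv's actual parser.
_ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?[A-Za-z_][A-Za-z0-9_]*\s*=.*$")


def suspicious_env_lines(text: str) -> list[int]:
    """Return 1-based numbers of lines that are not blank, comments, or KEY=VALUE."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are fine
        if not _ASSIGNMENT.match(line):
            bad.append(number)
    return bad


# Hypothetical file content with a malformed second line
sample = "OPENAI_API_KEY=sk-placeholder\nthis line has no equals sign\n# a comment\n"
print(suspicious_env_lines(sample))
```

In the notebook this could be called on the contents of the `.env` file to see which line triggers the warning.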


### Logging

In [3]:
#import logging
#import sys
#logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
#logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

### RAG pipeline

#### Loading

In [3]:
documents = SimpleDirectoryReader(DATA_PATH).load_data()

In [4]:
documents

[Document(id_='4598cc2d-494f-45b0-85c6-a99820cb306a', embedding=None, metadata={'page_label': '1', 'file_name': 'Maroc_Telecom-Financial_Report_2023.pdf', 'file_path': '/home/cuphead/Projects/llama-index/notebooks/../data/Maroc_Telecom-Financial_Report_2023.pdf', 'file_type': 'application/pdf', 'file_size': 4290606, 'creation_date': '2024-08-07', 'last_modified_date': '2024-08-07'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text=' \n   \nFINANCIAL  REPORT  \n2023  \n', mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'),
 Document(id_='a2a1328f-02ff-4de6-bca9-ee710abf9b58', embedding=None, metadata={'page_label': '2', 'file_name': 'Maroc_T

#### Create a client and a new collection for Chroma

In [10]:
db = chromadb.PersistentClient(path=VECTOR_STORE_PATH)
chroma_collection = db.get_or_create_collection("quickstart")

#### Set up ChromaVectorStore

In [12]:
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

#### Indexing and saving to disk

In [13]:
text_splitter = SentenceSplitter(chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP)

Settings.text_splitter = text_splitter

# storage_context wraps a collection from a PersistentClient, so Chroma
# writes the embeddings to VECTOR_STORE_PATH automatically
index = VectorStoreIndex.from_documents(
    documents,
    transformations=[text_splitter],
    storage_context=storage_context,
    show_progress=True
) 

Parsing nodes: 100%|██████████| 994/994 [00:02<00:00, 394.65it/s]
Generating embeddings: 100%|██████████| 2048/2048 [00:42<00:00, 48.23it/s]
Generating embeddings: 100%|██████████| 2/2 [00:01<00:00,  1.57it/s]
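
To get a feel for how `CHUNK_SIZE` and `CHUNK_OVERLAP` interact, the sketch below splits a token list with a plain sliding window. This is an illustration only: `SentenceSplitter` also tries to respect sentence boundaries, so its real chunks will differ, and the function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks


tokens = list(range(1200))  # stand-in for 1200 tokens
chunks = sliding_chunks(tokens, chunk_size=512, chunk_overlap=10)
print([len(c) for c in chunks])            # window sizes
print(chunks[0][-10:] == chunks[1][:10])   # neighbours share the overlap
```

With an overlap of only 10 tokens, a sentence that straddles a boundary is mostly split between two chunks; a larger overlap trades extra embeddings for more shared context.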


#### Load from disk

In [14]:
db2 = chromadb.PersistentClient(path=VECTOR_STORE_PATH)
chroma_collection = db2.get_or_create_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
loaded_index = VectorStoreIndex.from_vector_store(
    vector_store,
)
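
Because `get_or_create_collection` creates an empty collection when the name or path does not match an existing one, a quick sanity check before querying can save some confusion. The helper below is a sketch with a hypothetical name (`describe_collection`); it only assumes the object exposes `.name` and `.count()`, which Chroma collections do.

```python
def describe_collection(collection) -> str:
    """Summarize a collection-like object that exposes `.name` and `.count()`."""
    n = collection.count()
    if n == 0:
        return f"collection '{collection.name}' is empty; index documents before querying"
    return f"collection '{collection.name}' holds {n} embeddings"
```

In this notebook, `print(describe_collection(chroma_collection))` right after reloading would show whether the persisted embeddings were found.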

Here are five questions based on the scope of consolidation for Maroc Telecom:

1. **What is the percentage of voting rights Maroc Telecom holds in Mauritel, and how does this translate into their equity interest in the Mauritanian operator?**
   - Maroc Telecom holds 52% of the voting rights in Mauritel, which translates to a 41.2% equity interest in the Mauritanian operator due to their ownership of 80% in Compagnie Mauritanienne de Communications (CMC), the owner of Mauritel.

2. **When did Maroc Telecom fully consolidate Gabon Telecom, and what significant acquisition did Gabon Telecom make in 2016?**
   - Maroc Telecom fully consolidated Gabon Telecom on March 1, 2007. In 2016, Gabon Telecom acquired 100% of Atlantique Telecom Gabon’s capital, which was subsequently absorbed by Gabon Telecom on June 29, 2016.

3. **Which entities did Maroc Telecom acquire on January 26, 2015, and what was the consolidation status of these entities?**
   - On January 26, 2015, Maroc Telecom acquired Moov Africa Côte d'Ivoire (85% stake), Moov Africa Benin (100% stake), Moov Africa Togo (95% stake), Moov Africa Niger (100% stake), and Moov Africa Centrafrique (100% stake). All these entities have been fully consolidated in the financial statements of Maroc Telecom since January 31, 2015.

4. **When did Maroc Telecom acquire a 100% stake in Moov Africa Chad, and what is the consolidation status of this acquisition?**
   - Maroc Telecom acquired a 100% stake in Moov Africa Chad on June 26, 2019, and this entity has been fully consolidated in Maroc Telecom's financial statements since July 1, 2019.

5. **What type of investments are classified as "Other nonconsolidated investments" by Maroc Telecom, and can you name a few examples?**
   - "Other nonconsolidated investments" are those investments whose significance in relation to the consolidated financial statements is not material, or where Maroc Telecom does not exercise exclusive control, joint control, or significant influence. Examples include MT Cash, MT Fly, and minority interests in RASCOM, Autoroutes du Maroc, Arabsat, and other investments.

#### Query data from the persisted index

In [15]:
question="What is the percentage of voting rights Maroc Telecom holds in Mauritel, and how does this translate into their equity interest in the Mauritanian operator?"
query_engine = loaded_index.as_query_engine()
response = query_engine.query(question)
print(response)

Maroc Telecom holds 52% of the voting rights in Mauritel, which translates to a 41.2% equity interest in the Mauritanian operator.


In [16]:
response.metadata

{'88b876fe-0e6e-4b7b-a3c0-e8a6b2691f65': {'page_label': '24',
  'file_name': 'Maroc_Telecom_2024_half_year_Financial_Report.pdf',
  'file_path': '/home/cuphead/Projects/llama-index/notebooks/../data/Maroc_Telecom_2024_half_year_Financial_Report.pdf',
  'file_type': 'application/pdf',
  'file_size': 3670685,
  'creation_date': '2024-08-07',
  'last_modified_date': '2024-08-07'},
 '046dfe12-2fcb-4af1-bf38-2118c936d560': {'page_label': '24',
  'file_name': 'Maroc_Telecom_2023_half_year_Financial_Report.pdf',
  'file_path': '/home/cuphead/Projects/llama-index/notebooks/../data/Maroc_Telecom_2023_half_year_Financial_Report.pdf',
  'file_type': 'application/pdf',
  'file_size': 3488304,
  'creation_date': '2024-08-07',
  'last_modified_date': '2024-08-07'}}
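
`response.metadata` is keyed by node ID, which is hard to read at a glance. A small sketch that groups the retrieved pages by file name follows; the function name `summarize_sources` is hypothetical, and the example dictionary is a trimmed copy of the output above.

```python
from collections import defaultdict


def summarize_sources(metadata: dict) -> dict:
    """Map each source file name to the sorted page labels retrieved from it."""
    pages_by_file = defaultdict(set)
    for node_meta in metadata.values():
        name = node_meta.get("file_name", "unknown")
        pages_by_file[name].add(node_meta.get("page_label", "?"))
    # Page labels are strings, so they are sorted lexicographically
    return {name: sorted(pages) for name, pages in pages_by_file.items()}


# Trimmed copy of the metadata printed above
example = {
    "88b876fe-0e6e-4b7b-a3c0-e8a6b2691f65": {
        "page_label": "24",
        "file_name": "Maroc_Telecom_2024_half_year_Financial_Report.pdf",
    },
    "046dfe12-2fcb-4af1-bf38-2118c936d560": {
        "page_label": "24",
        "file_name": "Maroc_Telecom_2023_half_year_Financial_Report.pdf",
    },
}
print(summarize_sources(example))
```

In the notebook, `summarize_sources(response.metadata)` gives a compact view of which reports and pages backed the answer.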

Asking in French:

In [17]:
question2="Quand Maroc Telecom a-t-il complètement intégré Gabon Telecom, et quelle acquisition significative Gabon Telecom a-t-il réalisée en 2016 ?"
query_engine = loaded_index.as_query_engine()
response = query_engine.query(question2)
print(response)

Maroc Telecom a complètement intégré Gabon Telecom le 1er mars 2007. En 2016, Gabon Telecom a réalisé l'acquisition significative de la filiale Atlantique Telecom Gabon à Maroc Telecom, absorbée par Gabon Telecom le 29 juin 2016.


#### Adding new documents

In [19]:
new_documents=SimpleDirectoryReader("../delta_data").load_data()

In [21]:
for doc in new_documents:
    index.insert(doc)

Parsing nodes: 100%|██████████| 1/1 [00:00<00:00, 441.00it/s]

Asking about the new documents:

1. **Quelle a été l'évolution des revenus du Groupe Maroc Telecom en 2019 par rapport à 2018 ?**
   
   En 2019, le Groupe Maroc Telecom a généré des revenus totaux de 36 517 millions de MAD, ce qui représente une augmentation de 1,3% par rapport à 2018 (+0,9% à périmètre comparable). Cette performance est due à la croissance continue des activités au Maroc et à la résilience des activités internationales malgré la concurrence accrue et la pression réglementaire.

2. **Quel a été l'impact de la gestion des coûts sur l'EBITDA du Groupe Maroc Telecom en 2019 ?**
   
   Grâce à une gestion robuste des coûts, l'EBITDA du Groupe Maroc Telecom a atteint 18 922 millions de MAD en 2019, en hausse de 3,4% à périmètre comparable. La marge EBITDA a atteint 51,8%, en augmentation de 1,2 point à périmètre comparable.

3. **Quel est le montant des investissements réalisés par le Groupe Maroc Telecom en 2019 et comment se compare-t-il aux revenus ?**
   
   En 2019, les investissements en capital ont atteint 6 788 millions de MAD, ce qui représente une augmentation de 2,2% par rapport à l'année précédente. Ces investissements représentent 14,7% des revenus (hors fréquences et licences), ce qui est en ligne avec l'objectif déclaré pour l'année.

In [23]:
question3 = "Quelle a été l'évolution des revenus du Groupe Maroc Telecom en 2019 par rapport à 2018 ?"
query_engine = index.as_query_engine()
response = query_engine.query(question3)
print(response)

Les revenus du Groupe Maroc Telecom ont augmenté de 1,3% en 2019 par rapport à 2018.


In [24]:
response.metadata

{'67f9f414-5d36-49ec-b4d9-9a6709942bb9': {'page_label': '18',
  'file_name': 'Rapport_financier_2020.pdf',
  'file_path': '/home/cuphead/Projects/llama-index/notebooks/../data/Rapport_financier_2020.pdf',
  'file_type': 'application/pdf',
  'file_size': 3556301,
  'creation_date': '2024-08-07',
  'last_modified_date': '2024-08-07'},
 'aaa207d2-a93d-45be-b8aa-c68ba57b7292': {'page_label': '18',
  'file_name': 'Rapport_financier_2020.pdf',
  'file_path': '/home/cuphead/Projects/llama-index/notebooks/../delta_data/Rapport_financier_2020.pdf',
  'file_type': 'application/pdf',
  'file_size': 3556301,
  'creation_date': '2024-08-07',
  'last_modified_date': '2024-08-07'}}
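
Both source nodes above come from `Rapport_financier_2020.pdf`, page 18, once under `../data` and once under `../delta_data`, which suggests the same file was indexed twice. One way to avoid this is to skip documents that are already known before calling `index.insert`. The sketch below uses stand-in objects and a hypothetical helper `filter_new_documents`; matching on file name and page label is an assumption, and a content hash would be more robust.

```python
from types import SimpleNamespace


def filter_new_documents(new_docs, known_keys):
    """Keep only documents whose (file_name, page_label) pair is not already known."""
    seen = set(known_keys)
    fresh = []
    for doc in new_docs:
        key = (doc.metadata.get("file_name"), doc.metadata.get("page_label"))
        if key not in seen:
            seen.add(key)  # also guards against duplicates within new_docs
            fresh.append(doc)
    return fresh


# Stand-ins for Document objects; only the `.metadata` attribute is used
already_indexed = {("Rapport_financier_2020.pdf", "18")}
candidates = [
    SimpleNamespace(metadata={"file_name": "Rapport_financier_2020.pdf", "page_label": "18"}),
    SimpleNamespace(metadata={"file_name": "Rapport_financier_2020.pdf", "page_label": "19"}),
]
print(len(filter_new_documents(candidates, already_indexed)))
```

In the notebook, `known_keys` could be built from the metadata of `documents`, and only the filtered result of `new_documents` would then be passed to `index.insert`.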