# Reading and writing from Streamlit to MongoDB

## 1. These are the collections

```JS
async init() {
    this.client = new MongoClient(this.uri);
    await this.client.connect();
    this.db = this.client.db(DB);
    this.companiesCollection = this.db.collection('companies');
    this.usersCollection = this.db.collection('users');
    this.fundsCollection = this.db.collection('funds');
    this.exchangeCollection = this.db.collection('exchange');
    this.portfoliosCollection = this.db.collection('portfolios');
    this.frameworksCollection = this.db.collection('frameworks');
    this.metricMetaCollection = this.db.collection('metricMeta');
    this.metricValuesCollection = this.db.collection('metricValues');
}

// Connection settings; the URI comes from the environment rather than the source file.
const DB_URI = process.env.MONGO_URI;
const DB = 'General';
```

See: C:\Users\johan\Documents\GitHub\Supernova-PE\src\core\db.service.ts

The `db.service` holds one handle per collection (such as `companiesCollection`), which the rest of the backend uses to read and write documents.

## 2. These are the fields

The fields of the documents are not defined in one place; they are spread across the codebase. For example, this is where the fields of a `metricValues` document are set:

```ts
async setMetricValue(metricMetaID) {
  console.log('set metric value', this.metricValues[metricMetaID]);
  let obj: DbMetricValue = {
    company_link: this.company._id,
    value: this.metricValues[metricMetaID].value,
    reporting_period: this.request.reporting_period,
    metric_meta_link: metricMetaID,
    is_public: true
  }
  if (this.metricValues[metricMetaID]._id) {
    await this.api.put('admin/metric-value', this.metricValues[metricMetaID]._id, {value: obj.value})
  } else {
    await this.api.post('admin/metric-value', obj)
  }
  this.getData(this.requestID);
}
```

See: C:\Users\johan\Documents\GitHub\Supernova-PE\angular-src\app\projects\admin\requests-item\requests-item.component.ts

## 3. Schemas

The schema is defined by the interfaces in C:\Users\johan\Documents\GitHub\Supernova-PE\src\shared\interfaces

In a Node.js project this is typically done with [Mongoose](https://www.mongodb.com/developer/languages/javascript/getting-started-with-mongodb-and-mongoose/).
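
PyMongo does not enforce a schema, so a Python script that writes to these collections has to mirror the interface by hand. The sketch below copies the field names that `setMetricValue` sets on a `DbMetricValue`. The Python types, the `MetricValue` name and the `missing_fields` helper are assumptions for illustration and do not come from the repository.

```python
from typing import Any, TypedDict


class MetricValue(TypedDict):
    """Python mirror of the fields set in `setMetricValue` (types are assumed)."""
    company_link: str        # id of the company document
    value: Any               # the reported value; its real type is not known here
    reporting_period: str
    metric_meta_link: str    # id of the metricMeta document
    is_public: bool


def missing_fields(doc: dict) -> set:
    """Return the expected field names that are absent from `doc`."""
    return set(MetricValue.__annotations__) - set(doc)


# A candidate document that lacks two of the expected fields.
candidate = {"company_link": "c1", "value": 12.5, "is_public": True}
print(sorted(missing_fields(candidate)))
```

A check like this could run before `insert_one`, so that a document written from Python carries the same fields as one written by the Angular app.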

TODO:
- Use the connection string to connect from a Python script, then inspect a collection, a document, and its fields.
- Add and remove a document to check that writing works.
- Connect Streamlit.

In [1]:
import os

from pymongo import MongoClient
import pandas as pd

In [51]:
uri = os.getenv('MONGO_URI')
client = MongoClient(uri)

# database code goes here
db = client.General
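
`os.getenv` returns `None` when the variable is not set, and `MongoClient(None)` then falls back to `localhost`, which can hide a configuration mistake. A small guard fails early instead. This is a sketch, and `get_mongo_uri` is a hypothetical helper name.

```python
import os


def get_mongo_uri(var_name: str = "MONGO_URI") -> str:
    """Read the connection string from the environment, failing early if it is missing."""
    uri = os.getenv(var_name)
    if not uri:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return uri


# Usage (requires pymongo and a reachable server):
# client = MongoClient(get_mongo_uri())
```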

In [18]:
db.list_collection_names()

['users',
 'companies',
 'metricValues',
 'exchange',
 'funds',
 'metricMeta',
 'portfolios']

In [60]:
db.exchange.find_one()

{'_id': ObjectId('64632e73ee98577a8b286b3e'),
 'title': 'Generic request',
 'portfolio_link': '644c347fc69caac958b26d34',
 'fund_link': '6447ef3b63bfe2715b407dd3',
 'companies': [{'company_link': '64632e51ee98577a8b286b3c',
   'requested_metrics': ['6449b2faa2927cd3aa323dcb',
    '6449b304a2927cd3aa323dcc',
    '6449b3c1a2927cd3aa323dda',
    '6449b3caa2927cd3aa323ddb',
    '6449b2b7a2927cd3aa323dc8']},
  {'company_link': '64632e03ee98577a8b286b3a',
   'requested_metrics': ['6449b2faa2927cd3aa323dcb',
    '6449b304a2927cd3aa323dcc',
    '6449b3c1a2927cd3aa323dda',
    '6449b3caa2927cd3aa323ddb',
    '6449b2b7a2927cd3aa323dc8',
    '6449b348a2927cd3aa323dd3']},
  {'company_link': '64632be0ee98577a8b286b2a',
   'requested_metrics': ['6449b2faa2927cd3aa323dcb',
    '6449b304a2927cd3aa323dcc',
    '6449b3c1a2927cd3aa323dda',
    '6449b3caa2927cd3aa323ddb',
    '6449b2b7a2927cd3aa323dc8',
    '6449b348a2927cd3aa323dd3']},
  {'company_link': '6447eff42a81aa1fc0aa7e30',
   'requested_metrics'

In [53]:
companies = list(db.companies.find())
companies

[{'_id': ObjectId('6447eff42a81aa1fc0aa7e30'),
  'title': 'Pacific Carbon Capture',
  'industry': 'Apps',
  'revenue_category': '> 50m',
  'is_publicly_available': 'true',
  'is_invited': True,
  'invited_by': ['6462e5e236b2f0c087c5310e',
   '6462e92136b2f0c087c5310f',
   '64632e73ee98577a8b286b3e',
   '6492e5e7e3c1595c08087672',
   '6492f331f6d3e3bbc9f0639c',
   '64930745dae03dd14ac2b197',
   '64942550c30a80751b8edcd1',
   '6495747cdb62fd5ce34f6e9e',
   '649d1fb3cb7292a2bac89c40',
   '649d321abb31ea744f5672fd',
   '649e929fbd4b0417b817e81a'],
  'general_notes': 'very good company',
  'metric_notes': '2332'},
 {'_id': ObjectId('64632be0ee98577a8b286b2a'),
  'title': 'Carbonfuture',
  'description': 'Climate crisis fighters',
  'revenue_category': '10-50m',
  'industry': 'Biotechnology',
  'is_publicly_available': True,
  'invited_by': ['64632e73ee98577a8b286b3e', '64afb239c6d3b0ed2693dfc2']},
 {'_id': ObjectId('64632e03ee98577a8b286b3a'),
  'title': 'Equalture',
  'description': 'Shapi
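
The output above stores `is_publicly_available` as the string `'true'` in one document and as a boolean in another, and not every document has a `description`. Because the fields are not enforced anywhere, a quick survey of field names and value types helps when inspecting a collection. The sketch below uses a hypothetical `field_types` helper on two trimmed-down documents.

```python
from collections import defaultdict


def field_types(docs):
    """Map each field name to the set of Python type names seen across `docs`."""
    seen = defaultdict(set)
    for doc in docs:
        for key, value in doc.items():
            seen[key].add(type(value).__name__)
    return dict(seen)


# Two trimmed-down documents modelled on the companies output above.
sample = [
    {"title": "Pacific Carbon Capture", "is_publicly_available": "true"},
    {"title": "Carbonfuture", "is_publicly_available": True},
]
for name, types in field_types(sample).items():
    print(name, sorted(types))
```

The function accepts any iterable of dictionaries, so a cursor such as `db.companies.find()` can be passed in directly.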

## Adding a collection and a document inside it

MongoDB creates the collection implicitly when the first document is inserted.

In [9]:
import datetime
post = {
    "author": "Mike",
    "text": "My first blog post!",
    "tags": ["mongodb", "python", "pymongo"],
    "date": datetime.datetime.now(tz=datetime.timezone.utc),
}

test = db.test
test_id = test.insert_one(post).inserted_id
test_id

ObjectId('64ca8d26c7e45b4cc5e662d7')

In [10]:
db.list_collection_names()

['users',
 'companies',
 'metricValues',
 'test',
 'exchange',
 'funds',
 'metricMeta',
 'portfolios']

In [13]:
for f in db.test.find({}):
    print(f)

{'_id': ObjectId('64ca8d26c7e45b4cc5e662d7'), 'author': 'Mike', 'text': 'My first blog post!', 'tags': ['mongodb', 'python', 'pymongo'], 'date': datetime.datetime(2023, 8, 2, 17, 6, 46, 112000)}


In [14]:
db.test.drop()

In [15]:
db.list_collection_names()

['users',
 'companies',
 'metricValues',
 'exchange',
 'funds',
 'metricMeta',
 'portfolios']

## Adding a date field based on the id

In [37]:
db.users.find_one()['_id']

ObjectId('6447ef3b63bfe2715b407dd4')

In [36]:
for doc in db.users.find({}, {'_id': 1}):
    # `_id` is already an ObjectId, so no conversion from a string is needed.
    object_id = doc['_id']

    # Extract the creation timestamp embedded in the ObjectId (a UTC datetime).
    creation_time = object_id.generation_time

    # Add the timestamp to the document as a new field.
    db.users.update_one({'_id': object_id}, {'$set': {'createdAt': creation_time}})
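
`generation_time` works because the first four bytes of an ObjectId hold the creation time as seconds since the Unix epoch. The pure-Python sketch below decodes that timestamp by hand, without `bson`. `objectid_created_at` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def objectid_created_at(oid_hex: str) -> datetime:
    """Decode the creation time embedded in a 24-character ObjectId hex string."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 24 hex characters")
    seconds = int(oid_hex[:8], 16)  # first 4 bytes = big-endian Unix timestamp
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The id of the first user document shown in this section.
print(objectid_created_at("6447ef3b63bfe2715b407dd4"))  # 2023-04-25 15:18:19+00:00
```

The creation date can therefore be derived on the fly whenever it is needed, without storing a separate `createdAt` field.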

In [35]:
users = list(db.users.find())
users

[{'_id': ObjectId('6447ef3b63bfe2715b407dd4'),
  'name': 'Jefferson II',
  'email': 'jeff@supernova.ai',
  'password': 'not-so-secure',
  'fund_link': '6447ef3b63bfe2715b407dd3',
  'type': 'fund-investor',
  'createdAt': datetime.datetime(2023, 4, 25, 15, 18, 19)},
 {'_id': ObjectId('6448158c1b576b1a4524c721'),
  'name': 'Jefferson',
  'email': 'jeff1@supernova.ai',
  'password': 'not-so-secure',
  'fund_link': '6448158c1b576b1a4524c720',
  'type': 'fund-investor',
  'createdAt': datetime.datetime(2023, 4, 25, 18, 1, 48)},
 {'_id': ObjectId('6448261221f446cb608e7002'),
  'email': 'test@email.com',
  'name': 'John Doe',
  'type': 'company-manager',
  'company_link': '6448261221f446cb608e7001',
  'password': '8x5#SvJ#',
  'createdAt': datetime.datetime(2023, 4, 25, 19, 12, 18)},
 {'_id': ObjectId('644831372a81aa1fc0aa7e31'),
  'email': 'max@supernova.ai',
  'name': 'Max',
  'password': '1oC0!PJf4Y@1',
  'type': 'admin',
  'createdAt': datetime.datetime(2023, 4, 25, 19, 59, 51)},
 {'_id':

I'll remove the field again, because I don't know how the backend interfaces will handle the extra field.

In [38]:
db.users.update_many({}, {'$unset': {'createdAt': ""}})

<pymongo.results.UpdateResult at 0x2430d541db0>

In [39]:
db.users.find_one()

{'_id': ObjectId('6447ef3b63bfe2715b407dd4'),
 'name': 'Jefferson II',
 'email': 'jeff@supernova.ai',
 'password': 'not-so-secure',
 'fund_link': '6447ef3b63bfe2715b407dd3',
 'type': 'fund-investor'}

## Always close

In [50]:
# Close the connection to MongoDB when you're done.
client.close()

## Load MongoDB data directly into pandas

See: https://pypi.org/project/pymongoarrow/

In [73]:
import pymongoarrow as pma

In [61]:
from pymongoarrow.monkey import patch_all
patch_all() # to add pymongoarrow's functionality to all collections

```py
from pymongoarrow.api import Schema
schema = Schema({"_id": int, "qty": float})

from pymongo import MongoClient
client = MongoClient()
client.db.data.insert_many(
   [{"_id": 1, "qty": 25.4}, {"_id": 2, "qty": 16.9}, {"_id": 3, "qty": 2.3}]
)
```

In [75]:
db.list_collection_names()

['users',
 'companies',
 'metricValues',
 'exchange',
 'funds',
 'metricMeta',
 'portfolios']

In [74]:
data_frame = db.metricMeta.find_pandas_all({}) #, schema=schema)
data_frame[['code', 'title', 'description']]

| | code | title | description |
|---|---|---|---|
| 0 | greenhouse_gas_emissions_avoided | Greenhouse Gas Emissions Avoided | Amount of greenhouse gas (GHG) emissions avoid... |
| 1 | greenhouse_gas_emissions_mitigated | Greenhouse Gas Emissions Mitigated | Amount of greenhouse gas (GHG) emissions mitig... |
| 2 | km2_of_forest_monitored_for_clients | KM2 of Forest Monitored for Clients | This metric represents the number of square ki... |
| 3 | candidates_assessed_bias_free | Candidates Assessed Bias-Free | This metric represents the number of job candi... |
| 4 | water_saved | Water Saved | |
| 5 | aum_to_invest_in_climate_tech | AuM to Invest in Climate Tech | |
| 6 | active_platform_users | Active Platform Users | |
| 7 | ghg_emissions_scope_1 | GHG Emissions Scope 1 | |
| 8 | ghg_emissions_scope_2 | GHG Emissions Scope 2 | |
| 9 | ghg_emissions_scope_3 | GHG Emissions Scope 3 | |


## Schemas

A schema is not strictly necessary; it is mainly useful for selecting which columns to load.

```py
from datetime import datetime

from pymongoarrow.api import Schema
schema = Schema({'_id': int, 'amount': float, 'last_updated': datetime})
```
See: https://mongo-arrow.readthedocs.io/en/latest/data_types.html

## find
We are now ready to query our data. Let’s start by running a find operation to load all records with a non-zero amount as a pandas.DataFrame:

```py
df = client.db.data.find_pandas_all({'amount': {'$gt': 0}}, schema=schema)

# The same works with an aggregation pipeline:
df = client.db.data.aggregate_pandas_all([{'$match': {'amount': {'$lte': 10}}}], schema=schema)
```


## write back
```py
from pymongoarrow.api import write
from pymongo import MongoClient
coll = MongoClient().db.my_collection
write(coll, df)
```

# MongoDB to Streamlit

From the Streamlit tutorial page:

In [None]:
# streamlit_app.py

import streamlit as st
import pymongo

# Initialize connection.
# Uses st.cache_resource to only run once.
@st.cache_resource
def init_connection():
    return pymongo.MongoClient(**st.secrets["mongo"])

client = init_connection()

# Pull data from the collection.
# Uses st.cache_data to only rerun when the query changes or after 10 min.
@st.cache_data(ttl=600) # show_spinner=False
def get_data():
    db = client.mydb
    items = db.mycollection.find()
    items = list(items)  # make hashable for st.cache_data
    return items

items = get_data()

# Print results.
for item in items:
    st.write(f"{item['name']} has a :{item['pet']}:")

Writing text, code, or images to MongoDB should follow the same flow as above, using `insert_one` or `update_one`.
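
As a minimal sketch of the write side, the function below inserts or updates one document with `update_one` and `upsert=True`. The `save_text` helper and the `key`, `text` and `updatedAt` field names are hypothetical and only serve this example.

```python
from datetime import datetime, timezone


def save_text(collection, key: str, text: str):
    """Insert or update one document identified by `key`.

    `collection` is expected to be a PyMongo collection; `key`, `text` and
    `updatedAt` are hypothetical field names chosen for this example.
    """
    return collection.update_one(
        {"key": key},
        {"$set": {"text": text, "updatedAt": datetime.now(timezone.utc)}},
        upsert=True,
    )
```

In the app this could be called as `save_text(client.mydb.notes, "intro", some_text)`, where `notes` is an assumed collection name. Reads cached with `st.cache_data(ttl=600)` may lag behind such a write until the `ttl` expires.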

The most important next steps are:

- Make the portfolio visible in the dashboard/data.
- Save the graphs/code somewhere that runs the code.

## The code below replaces this in the Streamlit visualiser:

```py
url = "https://github.com/JohannesVC/streamlit/tree/master/supernova_app/Visualiser/data/supernova_xs.pickle"
data = requests.get(url).content
datasets["Supernova sample dataset"] = pd.read_pickle(BytesIO(data))
```

In [None]:
from pymongo import MongoClient
import pandas as pd
import os

uri = os.getenv('MONGO_URI')

client = MongoClient(uri)

# database code goes here
db = client.General

df = pd.read_pickle(r'C:\Users\johan\Documents\GitHub\supernova python\supernova_app\visualiser\data\supernova_xs.pickle')

df_dict = df.to_dict(orient='records')

# Careful: every run of this line inserts the records again, creating duplicates.
# db.visualiser_sample.insert_many(df_dict)

db['visualiser_sample'].find_one()
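
To make the seeding step safe to re-run and to read the sample back in the visualiser, something like the sketch below could be used. `seed_once` and `load_records` are hypothetical helper names; both only rely on the PyMongo collection methods `count_documents`, `insert_many` and `find`.

```python
def seed_once(collection, records):
    """Insert `records` only when the collection is still empty.

    Returns the number of documents inserted (0 if data was already there).
    """
    if collection.count_documents({}) > 0:
        return 0
    result = collection.insert_many(records)
    return len(result.inserted_ids)


def load_records(collection):
    """Fetch all documents without `_id`, ready to pass to pd.DataFrame."""
    return list(collection.find({}, {"_id": 0}))
```

The pickle download could then become `datasets["Supernova sample dataset"] = pd.DataFrame(load_records(db["visualiser_sample"]))`. Note that `insert_many` raises an error when given an empty list, so `records` should contain at least one document.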