# Database Migration Assistant for Azure Cosmos DB API for MongoDB

This notebook can be used to gather source environment details and assess incompatibilities while migrating from your native MongoDB instance to Azure Cosmos DB API for MongoDB.

**Update the input parameter below before running the assessment.** **Set it to the connection string of the source cluster endpoint that you want to assess.**

In [1]:
source_connection_string = ""
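
As an alternative to pasting the connection string into the cell above, the sketch below reads it from an environment variable and checks its shape before use. The variable name `SOURCE_MONGODB_URI` and the helper `looks_like_mongodb_uri` are hypothetical and not part of the assistant. The check assumes the endpoint is a standard MongoDB connection URI (`mongodb://` or `mongodb+srv://`).

```python
import os
from urllib.parse import urlsplit


def looks_like_mongodb_uri(uri: str) -> bool:
    """Return True if uri has a MongoDB scheme and a non-empty host part."""
    parts = urlsplit(uri)
    return parts.scheme in ("mongodb", "mongodb+srv") and bool(parts.netloc)


# SOURCE_MONGODB_URI is a hypothetical name chosen for this sketch.
candidate = os.environ.get("SOURCE_MONGODB_URI", "")

# Fall back to an empty string so a malformed value is easy to spot.
source_connection_string = candidate if looks_like_mongodb_uri(candidate) else ""
```

This only checks the form of the string. It does not confirm that the cluster is reachable or that the credentials are valid.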

In [2]:
from source_mongodb import SourceMongoDB
source_mongodb = SourceMongoDB(endpoint=source_connection_string)

**Gather source environment info**

You may share these details with your Microsoft points of contact for total cost of ownership (TCO) calculations or migration discussions.

In [3]:
source_mongodb.get_environment_info()
source_mongodb.save_environment_info_to_csv()
source_mongodb.print_environment_info()

MongoDB version:  4.0.23
License Type:  Community
Is Sharded endpoint: No


**Gather source workload info**

Workload info is saved to the CSV output files _workload\_database\_details.csv_ and _workload\_collection\_details.csv_ in the same directory as the notebook.

In [4]:
source_mongodb.get_workload_info()
source_mongodb.save_workload_info_to_csv()
source_mongodb.print_workload_info()

Workload database details: 


Row,DB Name,Collection Count,Doc Count,Avg Doc Size,Data Size,Index Count,Index Size
1,bookstoretest,2,192200,4144,796572532,7,260636672
2,cosmosbookstore,1,96604,4145,400497620,1,1814528
3,geo,2,25554,252,6446542,2,266240
4,kagglemeta,2,87934912,190,16725184704,2,891363328
5,pe_orig,2,57703820,668,38561434711,2,861605888
6,portugeseelection,2,30230038,687,20782985862,1,450932736
7,sample_mflix,5,75583,691,52300763,5,798720
8,test,1,22,545,12003,0,0
9,testcol,26,46,88,4082,32,589824
10,testhav,3,2,528,1057,3,36864


Workload collection details: 


Row,DB Name,Collection Name,Doc Count,Avg Doc Size,Data Size,Index Count,Index Size,Indexes
1,bookstoretest,books,96100,4144,398286266,5,89419776,"{'_id_': {'v': 2, 'key': [('_id', 1)], 'ns': 'bookstoretest.books'}, 'rating_-1': {'v': 2, 'key': [('rating', -1.0)], 'ns': 'bookstoretest.books'}, 'desc_text': {'v': 2, 'key': [('_fts', 'text'), ('_ftsx', 1)], 'ns': 'bookstoretest.books', 'weights': SON([('desc', 1)]), 'default_language': 'english', 'language_override': 'language', 'textIndexVersion': 3}, 'title_1_reviews_-1': {'v': 2, 'key': [('title', 1.0), ('reviews', -1.0)], 'ns': 'bookstoretest.books'}, 'title_1_reviewcomments.name_1': {'v': 2, 'key': [('title', 1.0), ('reviewcomments.name', 1.0)], 'ns': 'bookstoretest.books'}}"
2,bookstoretest,books1,96100,4144,398286266,2,171216896,"{'_id_': {'v': 2, 'key': [('_id', 1)], 'ns': 'bookstoretest.books1'}, '$**_text': {'v': 2, 'key': [('_fts', 'text'), ('_ftsx', 1)], 'ns': 'bookstoretest.books1', 'weights': SON([('$**', 1)]), 'default_language': 'english', 'language_override': 'language', 'textIndexVersion': 3}}"
3,cosmosbookstore,books,96604,4145,400497620,1,1814528,"{'_id_': {'v': 2, 'key': [('_id', 1)], 'ns': 'cosmosbookstore.books'}}"
4,geo,neighborhoods,195,17383,3389859,1,16384,"{'_id_': {'v': 2, 'key': [('_id', 1)], 'ns': 'geo.neighborhoods'}}"
5,geo,restaurants,25359,120,3056683,1,249856,"{'_id_': {'v': 2, 'key': [('_id', 1)], 'ns': 'geo.restaurants'}}"
6,kagglemeta,episodeagentscol,30632000,190,5826662261,1,310607872,"{'_id_': {'v': 2, 'key': [('_id', 1)], 'ns': 'kagglemeta.episodeagentscol'}}"
7,kagglemeta,episodeagents,57302912,190,10898522443,1,580755456,"{'_id_': {'v': 2, 'key': [('_id', 1)], 'ns': 'kagglemeta.episodeagents'}}"
8,pe_orig,system.views,0,0,0,1,20480,"{'_id_': {'v': 2, 'key': [('_id', 1)], 'ns': 'pe_orig.system.views'}}"
9,pe_orig,tweets,57703820,668,38561434711,1,861585408,"{'_id_': {'v': 2, 'key': [('_id', 1)], 'ns': 'pe_orig.tweets'}}"
10,portugeseelection,tweetscol,30230000,687,20782964055,1,450932736,"{'_id_': {'v': 2, 'key': [('_id', 1)], 'ns': 'portugeseelection.tweetscol'}}"
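
The saved workload details can be summarized outside the assistant. The sketch below is a minimal example that assumes the CSV file uses the same column names as the printed output (`DB Name`, `Data Size`). The helper `summarize_databases` is hypothetical, and the inline sample reuses three rows from the database details above so the block runs without the real file.

```python
import csv
import io


def summarize_databases(csv_file):
    """Return (total data size, name of the largest database) from a CSV stream."""
    rows = list(csv.DictReader(csv_file))
    total = sum(int(row["Data Size"]) for row in rows)
    largest = max(rows, key=lambda row: int(row["Data Size"]))
    return total, largest["DB Name"]


# Inline sample copied from the printed database details (three rows only).
sample = io.StringIO(
    "DB Name,Collection Count,Doc Count,Avg Doc Size,Data Size,Index Count,Index Size\n"
    "bookstoretest,2,192200,4144,796572532,7,260636672\n"
    "geo,2,25554,252,6446542,2,266240\n"
    "pe_orig,2,57703820,668,38561434711,2,861605888\n"
)

total_data_size, largest_db = summarize_databases(sample)
print(total_data_size, largest_db)
```

To run it against the real output, pass `open("workload_database_details.csv", newline="")` in place of `sample`, assuming the file layout matches.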


**Run assessment**

This step assesses the source workload for unsupported features, partially supported features, and limit warnings in Azure Cosmos DB API for MongoDB.  
The checks cover the most commonly observed scenarios in each category.

In [5]:
source_mongodb.workload_info.assess_unsupported_features()
source_mongodb.workload_info.save_assessment_result_unsupported()

source_mongodb.workload_info.assess_partially_supported_features()
source_mongodb.workload_info.save_assessment_result_partially_supported()

source_mongodb.workload_info.assess_limits()
source_mongodb.workload_info.save_assessment_result_limits()

source_mongodb.workload_info.print_assessment_results()

Assessment for unsupported features begins...
Assessment for unsupported features completed. Some results were found and will be printed once all the assessments complete.
Assessment for partially supported features begins...
Assessment for partially supported features completed. Some results were found and will be printed once all the assessments complete.
Assessment for limits begins...
Assessment for limits completed. Some results were found.


Assessment results: 


Row,Category,Sub-category,DB Name,Collection Name,Index,Message
1,Unsupported feature,Text index,bookstoretest,books,"{'v': 2, 'key': [('_fts', 'text'), ('_ftsx', 1)], 'ns': 'bookstoretest.books', 'weights': SON([('desc', 1)]), 'default_language': 'english', 'language_override': 'language', 'textIndexVersion': 3}",Text indexes are not supported in Azure Cosmos DB API for MongoDB. Azure Cosmos DB is a crucial part of the Azure ecosystem and is well integrated with other Azure services like Azure Search which offer advanced search features like wildcard search etc. We recommend using Azure Search for full text search functionalities.
2,Unsupported feature,Text index,bookstoretest,books1,"{'v': 2, 'key': [('_fts', 'text'), ('_ftsx', 1)], 'ns': 'bookstoretest.books1', 'weights': SON([('$**', 1)]), 'default_language': 'english', 'language_override': 'language', 'textIndexVersion': 3}",Text indexes are not supported in Azure Cosmos DB API for MongoDB. Azure Cosmos DB is a crucial part of the Azure ecosystem and is well integrated with other Azure services like Azure Search which offer advanced search features like wildcard search etc. We recommend using Azure Search for full text search functionalities.
3,Partially supported feature,Unique index,testcol,col4,"{'v': 2, 'unique': True, 'key': [('instock.warehouse', 1.0)], 'ns': 'testcol.col4'}",Unique indexes can only be created on empty collections currently in Azure Cosmos DB API for MongoDB. Please make sure you migrate the data to Cosmos DB after creating the index. We are currently working on a fix to allow creating unique indexes on non-empty collections. This functionality will be available soon.
4,Partially supported feature,Unique index,testcol,col3,"{'v': 2, 'unique': True, 'key': [('size.uom', 1.0)], 'ns': 'testcol.col3'}",Unique indexes can only be created on empty collections currently in Azure Cosmos DB API for MongoDB. Please make sure you migrate the data to Cosmos DB after creating the index. We are currently working on a fix to allow creating unique indexes on non-empty collections. This functionality will be available soon.
5,Partially supported feature,Compound index with nested field,bookstoretest,books,"{'v': 2, 'key': [('title', 1.0), ('reviewcomments.name', 1.0)], 'ns': 'bookstoretest.books'}","Compound indexes with nested fields are not fully supported in Azure Cosmos DB API for MongoDB. If you are using compound index where the nested fields are docs (not arrays), you may raise a support ticket to enable the functionality."
6,Partially supported feature,Compound index with nested field,testcol,col2,"{'v': 2, 'key': [('item', 1.0), ('size.h', -1.0), ('dimension.w', -1.0)], 'ns': 'testcol.col2'}","Compound indexes with nested fields are not fully supported in Azure Cosmos DB API for MongoDB. If you are using compound index where the nested fields are docs (not arrays), you may raise a support ticket to enable the functionality."
7,Partially supported feature,Compound index with nested field,testcol,col1,"{'v': 2, 'key': [('item', 1.0), ('size.h', -1.0)], 'ns': 'testcol.col1'}","Compound indexes with nested fields are not fully supported in Azure Cosmos DB API for MongoDB. If you are using compound index where the nested fields are docs (not arrays), you may raise a support ticket to enable the functionality."
8,Partially supported feature,Unique index with nested field,testcol,col4,"{'v': 2, 'unique': True, 'key': [('instock.warehouse', 1.0)], 'ns': 'testcol.col4'}","Unique indexes with nested fields are not fully supported in Azure Cosmos DB API for MongoDB. If you are using unique index where the nested fields are docs (not arrays), you may raise a support ticket to enable the functionality."
9,Partially supported feature,Unique index with nested field,testcol,col3,"{'v': 2, 'unique': True, 'key': [('size.uom', 1.0)], 'ns': 'testcol.col3'}","Unique indexes with nested fields are not fully supported in Azure Cosmos DB API for MongoDB. If you are using unique index where the nested fields are docs (not arrays), you may raise a support ticket to enable the functionality."
10,Partially supported feature,TTL index,testcol,col1,"{'v': 2, 'key': [('lastModifieddate', 1.0)], 'ns': 'testcol.col1', 'expireAfterSeconds': 3600.0}",Currently TTL indexes can only be created on _ts field in Azure Cosmos DB API for MongoDB. The _ts field is specific to Azure Cosmos DB and is not accessible from MongoDB clients. It is a reserved (system) property that contains the time stamp of the document's last modification. We will soon be supporting TTL indexes on all fields.
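
The index definitions in the results above suggest how each sub-category could be derived from an index document. The sketch below is an illustration only, not the assistant's actual implementation. `classify_index` is a hypothetical helper, and its rules are inferred from the printed results: a `'text'` key value, a `unique` flag, a dotted field name, or an `expireAfterSeconds` option.

```python
def classify_index(index):
    """Return the assessment sub-categories an index definition would fall under."""
    fields = [name for name, _ in index["key"]]
    directions = [direction for _, direction in index["key"]]
    nested = any("." in name for name in fields)  # dotted path = nested field

    findings = []
    if "text" in directions:
        findings.append("Text index")
    if index.get("unique"):
        findings.append("Unique index")
        if nested:
            findings.append("Unique index with nested field")
    if len(fields) > 1 and nested:
        findings.append("Compound index with nested field")
    if "expireAfterSeconds" in index:
        findings.append("TTL index")
    return findings


# Index definitions copied from the assessment results above.
unique_nested = {"v": 2, "unique": True, "key": [("instock.warehouse", 1.0)], "ns": "testcol.col4"}
compound_nested = {"v": 2, "key": [("item", 1.0), ("size.h", -1.0)], "ns": "testcol.col1"}

print(classify_index(unique_nested))
print(classify_index(compound_nested))
```

A plain `_id` index or a compound index without dotted field names yields an empty list under these rules, which is consistent with those indexes not appearing in the results.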


Zip together the outputs from the Database Migration Assistant.

In [6]:
source_mongodb.zip_dma_outputs()