# Schema in MongoDB

## What is data modeling?

Data modeling refers to the organization of data within a database and the links between related entities.
Data in MongoDB has a flexible schema model, which means:
- Documents within a single collection are *not required* to have the same set of fields.
- A field's data type can differ between documents within a collection.
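
As a minimal sketch of what this flexibility looks like, the plain JavaScript below models two documents that could sit in the same collection. The collection name `books`, the field names, and the values are all hypothetical, and no database connection is involved.

```javascript
// Two documents that could live in the same hypothetical "books" collection.
const books = [
  { _id: 1, title: "Alice in Wonderland", isbn: "isbn-001" },
  // No "isbn" field, an extra "pages" field, and "_id" is a string here.
  { _id: "b-2", title: "Through the Looking-Glass", pages: 224 },
];

// Map each field of a document to the JavaScript type of its value,
// so the differences between documents become visible.
function fieldTypes(doc) {
  return Object.fromEntries(
    Object.entries(doc).map(([key, value]) => [key, typeof value])
  );
}

const shapes = books.map(fieldTypes);
console.log(shapes);
```

The two documents differ both in their set of fields and in the type of `_id`, which is exactly what a flexible schema permits.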

Generally, documents in a collection share a similar structure. To ensure consistency in your data model, you can create schema validation rules.

## Why do we need data modeling?

Data modeling is essential for several reasons:

1. **Data organization**: Data modeling helps structure and organize data within a database. By defining the relationships between entities and establishing a schema, data modeling provides a framework for storing and retrieving information efficiently.

2. **Data integrity**: A well-designed data model ensures data integrity by enforcing constraints and rules on the data stored in the database. This helps prevent data inconsistencies, inaccuracies, and anomalies.

3. **Data consistency**: Data modeling allows you to define standards and guidelines for how data should be stored and represented. This ensures that data is consistent across different documents or entities, making it easier to query and analyze.

4. **Scalability**: Data modeling plays a crucial role in designing a scalable database system. By carefully modeling the data and considering factors such as data growth, access patterns, and future requirements, you can design a database that can scale with increasing data volumes and user demands.

## Entity-relationship (ER) data modeling framework

In MongoDB, certain patterns in how documents are structured reflect data modeling decisions.

Data modeling is a process which identifies relevant data, how that data should be captured and worked with, and ultimately, how the data can be visualized as a diagram. This visual representation not only helps identify all data components, but also helps determine the relationships among data elements while finding the best way to demonstrate those relationships.

An entity-relationship (ER) diagram represents the structure of a database visually. ER modeling is a systematic process for designing a database, because it requires you to analyze all data requirements before implementing the database.

In the end, we can represent the data model of our database as the following diagram:
![image.png](attachment:image.png)

Data models consist of the following components:

- Entity: An entity is an independent object that is also a logical component of the system. Entities can be tangible or intangible: tangible entities (such as books) exist in the real world, while intangible entities (such as book loans) don’t have a physical form.

    - Entity instances: An entity instance is one specific occurrence of an entity. For example, “Alice in Wonderland” is a specific instance of the tangible entity “book.”

    - Attributes: Attributes describe the characteristics of an entity. For example, the entity “book” has the attributes International Standard Book Number or ISBN (String) and title (String).

- Relationships: Relationships define the connections between entities. For example, one user can borrow many books at a time, so the relationship between the entities "users" and "books" is one-to-many.
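
The components above can be sketched with plain JavaScript objects. This is only an illustration under stated assumptions: the sample users, the placeholder ISBN values, and the reference field `borrowedBy` are hypothetical and not part of the original data model.

```javascript
// Entity "user": each object in the array is one entity instance.
const users = [
  { _id: "u1", name: "Ada" },
  { _id: "u2", name: "Grace" },
];

// Entity "book": its attributes are isbn (String) and title (String).
// borrowedBy is a hypothetical reference field that models the relationship.
const books = [
  { isbn: "isbn-001", title: "Alice in Wonderland", borrowedBy: "u1" },
  { isbn: "isbn-002", title: "Through the Looking-Glass", borrowedBy: "u1" },
  { isbn: "isbn-003", title: "Sylvie and Bruno", borrowedBy: null },
];

// One-to-many: follow the reference from one user to all books they borrowed.
function booksBorrowedBy(userId) {
  return books.filter((book) => book.borrowedBy === userId);
}

console.log(booksBorrowedBy("u1").map((book) => book.title));
```

Storing the user's identifier on each book is one of several ways to express a one-to-many relationship; embedding an array of books inside the user document would be another.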

## Identifying entities of our database

The first part of data modeling is identifying entities of our data model. 

In this step, we also need to identify entity instances and their attributes. This requires business domain expertise and the ability to anticipate both current and future use cases. We will use one collection to model each entity.

## What is a Schema?

A schema is an essential concept in databases. Every collection has a schema object.
![image.png](attachment:image.png)

For each entity in the data model, we expect some common structure among all documents. A schema is a JSON object that defines the structure and contents of your data.

We use JSON schemas to define an application's data model and to validate documents whenever they're created, changed, or deleted.

Schemas represent types of data rather than specific values. App Services supports many built-in schema types. These include primitives, like strings and numbers, as well as structural types, like objects and arrays, which you can combine to create schemas that represent custom object types.
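
To illustrate how primitive and structural types can be combined, here is a minimal sketch of a schema for a hypothetical `book` entity. The field names `authors` and `publisher` are assumptions made for this example; the `bsonType`, `properties`, `required`, and `items` keywords follow the same JSON-schema style used elsewhere in this document.

```javascript
// Hypothetical schema for a "book" entity, combining primitive
// types (string, int) with structural types (object, array).
const bookSchema = {
  title: "book",
  bsonType: "object",
  required: ["_id", "isbn", "title"],
  properties: {
    _id: { bsonType: "objectId" },
    isbn: { bsonType: "string" },
    title: { bsonType: "string" },
    // Array type: a list whose items are all strings.
    authors: { bsonType: "array", items: { bsonType: "string" } },
    // Object type: a nested document with its own properties.
    publisher: {
      bsonType: "object",
      properties: {
        name: { bsonType: "string" },
        year: { bsonType: "int" },
      },
    },
  },
};

console.log(Object.keys(bookSchema.properties));
```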

For example, this is a basic schema for data about cars and some car objects that conform to the schema:

```javascript
{
  "title": "car",
  "required": [
    "_id",
    "year",
    "make",
    "model",
    "miles"
  ],
  "properties": {
    "_id": { "bsonType": "objectId" },
    "year": { "bsonType": "string" },
    "make": { "bsonType": "string" },
    "model": { "bsonType": "string" },
    "miles": { "bsonType": "number" }
  }
}
```
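
To make the idea of conforming to a schema concrete, the sketch below checks two hypothetical car documents against the schema above. It is a deliberately simplified stand-in, not MongoDB's own validator: the `validate` function and the `typeChecks` table are assumptions written for this example, an `objectId` is approximated as a 24-character hexadecimal string, and the sample cars are made up.

```javascript
// The car schema from above, as a plain JavaScript object.
const carSchema = {
  title: "car",
  required: ["_id", "year", "make", "model", "miles"],
  properties: {
    _id: { bsonType: "objectId" },
    year: { bsonType: "string" },
    make: { bsonType: "string" },
    model: { bsonType: "string" },
    miles: { bsonType: "number" },
  },
};

// Simplified stand-ins for BSON type checks (assumption: an objectId
// is represented here as a 24-character hex string).
const typeChecks = {
  objectId: (v) => typeof v === "string" && /^[0-9a-f]{24}$/i.test(v),
  string: (v) => typeof v === "string",
  number: (v) => typeof v === "number" && Number.isFinite(v),
};

// Return a list of human-readable problems; an empty list means the
// document conforms to the required fields and declared types.
function validate(doc, schema) {
  const errors = [];
  for (const field of schema.required) {
    if (!(field in doc)) errors.push(`missing required field: ${field}`);
  }
  for (const [field, rule] of Object.entries(schema.properties)) {
    if (field in doc && !typeChecks[rule.bsonType](doc[field])) {
      errors.push(`field ${field} is not of type ${rule.bsonType}`);
    }
  }
  return errors;
}

// Hypothetical sample documents.
const goodCar = { _id: "0123456789abcdef01234567", year: "2017", make: "Honda", model: "Civic", miles: 117424 };
const badCar = { _id: "0123456789abcdef01234568", year: 2017, make: "Honda", miles: "many" };

console.log(validate(goodCar, carSchema));
console.log(validate(badCar, carSchema));
```

In this sketch `goodCar` produces no errors, while `badCar` is reported as missing `model` and as having the wrong types for `year` and `miles`.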
