# Overview of CIM Profiles


One of the common misconceptions regarding the CIM is that the entire information model must be adopted and implemented. As an enterprise-focused canonical information model, the full CIM covers an extremely broad range of aspects of modeling, operations, billing, asset management, and energy markets. Any given application or even an entire data-rich environment will only need a small subset of the full common information model. 

When choosing CIM for a particular project, application, or data integration effort, it is important to consider the particular use case, the functional objectives that must be met, and the minimum set of attributes that must be modeled to meet the identified requirements. This section briefly discusses some of the considerations involved in selecting a CIM profile, building a data profile, populating that profile with power system data, and then selecting a database structure to contain that data.

## What is a CIM Profile?

Before introducing physical models, a clear understanding of profiles is essential. **Profiles are defined as secondary models derived from the information model.** Profiles must be based on classes, attributes, and associations contained in the CIM information model. They can never introduce anything new beyond what exists in the canonical CIM.

The purpose of profiles is to identify a **subset of the information model** to meet a particular need. Profiles can:
- Constrain cardinality of attributes
- Remove unwanted attributes and classes
- Represent structures required for specific applications
- Focus on particular domains (equipment, topology, geography, state estimation, etc.)

After completing the logical model in UML, profiles can be derived to depict structures that support specific topics such as equipment profiles, geographical profiles, steady state hypothesis profiles, and topology profiles.
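The subsetting idea can be sketched in a few lines of plain Python. This is only an illustration: the toy `INFO_MODEL` dictionary, the `derive_profile` helper, and the cardinality strings are assumptions made for this example, not real CIM definitions or CIMTool output.

```python
# Toy stand-in for an information model: class -> {attribute: cardinality}.
# The contents are illustrative only, not the real CIM definitions.
INFO_MODEL = {
    "ACLineSegment": {"mRID": "0..1", "name": "0..1", "length": "0..1", "r": "0..1"},
    "Terminal": {"mRID": "0..1", "name": "0..1", "sequenceNumber": "0..1"},
    "Customer": {"mRID": "0..1", "kind": "0..1"},
}


def derive_profile(model, selection):
    """Return a subset of `model`, optionally with tightened cardinalities.

    `selection` maps class -> {attribute: cardinality or None}; None keeps the
    cardinality from the model. Anything missing from the model is rejected,
    because a profile may not introduce new classes or attributes.
    """
    profile = {}
    for cls, attrs in selection.items():
        if cls not in model:
            raise KeyError(f"{cls} is not in the information model")
        profile[cls] = {}
        for attr, cardinality in attrs.items():
            if attr not in model[cls]:
                raise KeyError(f"{cls}.{attr} is not in the information model")
            profile[cls][attr] = cardinality or model[cls][attr]
    return profile


topology_profile = derive_profile(
    INFO_MODEL,
    {
        "ACLineSegment": {"mRID": "1..1", "name": None},  # mRID made mandatory
        "Terminal": {"mRID": "1..1", "sequenceNumber": None},
    },
)
print(topology_profile)
```

The unwanted `Customer` class and the impedance attribute `r` simply never make it into the profile, while `mRID` is constrained from optional to mandatory.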

## Summary and Next Steps

This overview introduced the key concepts of CIM profiling:

1. **Profiles** are subsets of the full CIM information model tailored to specific use cases
2. **Data profiles** are physical serializations of CIM profiles (XML, JSON, Python dataclasses)
3. **Profile selection** involves understanding business processes, use cases, and minimum required information
4. **mRIDs** (Master Resource Identifiers) are critical for consistent cross-application integration
5. **CIMantic Graphs** uses Python dataclass schemas as data profiles for native Python integration

### What's Next:

- **Building Profiles** - Learn how to create custom profiles using CIMTool
- **Using Objects** - Work with CIM dataclass objects in Python
- **Incrementals** - Manage model changes and updates
- **Units** - Handle physical quantities with proper unit conversion

The following notebooks will dive deeper into each of these topics with practical examples.

## Example: Importing a CIM Profile

Let's see how to import and inspect a CIM profile in CIMantic Graphs:

```python
# Import a CIM profile (CIM 17 version 40)
import cimgraph.data_profile.cim17v40 as cim

# Inspect what classes are available
print("Sample classes in this profile:")
print(f"  - {cim.ACLineSegment}")
print(f"  - {cim.PowerTransformer}")
print(f"  - {cim.EnergyConsumer}")
print(f"  - {cim.Terminal}")
print(f"  - {cim.ConnectivityNode}")
```

```python
# Inspect attributes of a specific class
from dataclasses import fields

print("\nACLineSegment attributes:")
for field in fields(cim.ACLineSegment):
    # Each field's metadata records whether it is an attribute or an association
    field_type = field.metadata.get('type', 'unknown')
    print(f"  - {field.name}: {field_type}")
    if field_type == 'Association':
        # Associations also record the name of the inverse end
        inverse = field.metadata.get('inverse', '')
        print(f"      (inverse: {inverse})")
```

## CIMantic Graphs and Python Data Profiles

**CIMantic Graphs** provides a Python-native approach to working with CIM profiles. Instead of XML or JSON schemas, it uses **Python dataclass schemas** as the data profile format.

### Advantages of Python Dataclasses:

1. **Native Python Integration** - Classes can be imported and used directly in Python code
2. **Type Hints** - Full support for Python type checking and IDE autocomplete
3. **Introspection** - Easy to programmatically inspect attributes, types, and metadata
4. **In-Memory Graphs** - Efficient knowledge graph representation in memory
5. **Auto-Generated Queries** - Database queries generated directly from dataclass structure

### The CIMTool Builder Workflow:

1. Create or select a CIM Profile in CIMTool (UML-based)
2. Use the **cimantic-graphs.xsl** builder to generate Python dataclass schema
3. Use the **cimantic-graphs-init.xsl** builder to generate `__init__.py`
4. Place generated files in `cimgraph/data_profile/your_profile_name/`
5. Import and use: `import cimgraph.data_profile.your_profile_name as cim`

This approach maintains all the benefits of CIM profiling while providing a Pythonic development experience.
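To make the dataclass idea concrete, here is a small hand-written sketch of what such a schema might look like. It is not output from the CIMTool builders: the `LineSegment` and `Terminal` classes and the `describe` helper are hypothetical, and the `type` and `inverse` metadata keys are assumed only because they are the keys the inspection example in this notebook reads.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class Terminal:
    # Plain attribute
    mRID: Optional[str] = field(default=None, metadata={"type": "Attribute"})
    # Association to another class, with the name of its inverse end
    ConductingEquipment: Optional["LineSegment"] = field(
        default=None,
        metadata={"type": "Association", "inverse": "LineSegment.Terminals"},
    )


@dataclass
class LineSegment:
    mRID: Optional[str] = field(default=None, metadata={"type": "Attribute"})
    length: Optional[float] = field(default=None, metadata={"type": "Attribute"})
    Terminals: List[Terminal] = field(
        default_factory=list,
        metadata={"type": "Association", "inverse": "Terminal.ConductingEquipment"},
    )


def describe(cls):
    """Map each field name of a dataclass to its 'type' metadata entry."""
    return {f.name: f.metadata.get("type", "unknown") for f in fields(cls)}


# Build a tiny in-memory graph: one line with one terminal, linked both ways.
line = LineSegment(mRID="line-1", length=120.0)
terminal = Terminal(mRID="term-1", ConductingEquipment=line)
line.Terminals.append(terminal)

print(describe(LineSegment))
```

Because the schema is ordinary Python, the same `fields()` introspection that prints this summary could also drive query generation or validation.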

### 7. The Critical Role of mRIDs (Master Resource Identifiers)

One of the most critical aspects of populating CIM models is establishing **persistent Universally Unique Identifiers (UUIDs)** to be used as **Master Resource Identifiers (mRIDs)** for all network parameters, asset characteristics, and SCADA measurements.

#### Why mRIDs Matter:

1. **Cross-Application Consistency** - All applications in the data integration effort must refer to equipment by the same mRID. Inconsistent mRIDs require mapping tables, defeating the purpose of CIM-based integration.

2. **Data Mapping Stability** - Changes to mRIDs affect other data mappings (e.g., SCADA measurements to assets). Persistent mRIDs ensure edits to network models don't break other data mappings.

3. **Inter-Utility Exchange** - When exchanging data between utilities:
   - **Option 1:** Use the same mRIDs across both utilities (requires coordination and master lists)
   - **Option 2:** Create reference tables converting mRIDs between utilities

**Best Practice:** Use UUID format (RFC 4122) for mRIDs to ensure global uniqueness.
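A minimal sketch of persistent mRID assignment using Python's standard `uuid` module follows. The `MridRegistry` class, the helper names, and the source-tool name `line_650632` are hypothetical; a real integration would persist the mapping in a database or file rather than in memory.

```python
import uuid


def new_mrid() -> str:
    """Generate a random (version 4) UUID string for use as an mRID."""
    return str(uuid.uuid4())


def is_valid_mrid(value: str) -> bool:
    """Check that `value` is a UUID written in canonical hyphenated form."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


class MridRegistry:
    """Remember which mRID was issued for each source-tool name.

    Reusing the stored mRID on every export keeps identifiers persistent
    even when the network model is rebuilt from the source tool.
    """

    def __init__(self, known=None):
        self._by_name = dict(known or {})

    def mrid_for(self, source_name: str) -> str:
        if source_name not in self._by_name:
            self._by_name[source_name] = new_mrid()
        return self._by_name[source_name]


registry = MridRegistry()
first = registry.mrid_for("line_650632")
second = registry.mrid_for("line_650632")  # same object, same mRID
print(first == second, is_valid_mrid(first))
```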

#### Measurement mRIDs

CIM requires that every measurement or data source be assigned a unique mRID and be associated with one terminal of a piece of equipment:

- PT voltage measurement → unique mRID → associated with Terminal
- CT current measurement → unique mRID → associated with Terminal  
- Calculated MW/MVAr/MVA → unique mRIDs → associated with Terminal

This approach simplifies real-time publishing: measurements can be sent as simple `{"meas-mrid-1234": value}` pairs without additional metadata, since all context is in the network model.
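As a rough sketch of the receiving side, a subscriber can recover all context for a published value from a lookup built once from the network model. The `MEASUREMENT_CONTEXT` table, its keys, the placeholder identifiers, and the `decode` helper are assumptions for illustration only.

```python
import json

# Hypothetical lookup built once from the network model:
# measurement mRID -> context needed to interpret the value.
MEASUREMENT_CONTEXT = {
    "meas-mrid-1234": {"quantity": "voltage", "unit": "V", "terminal": "term-mrid-0001"},
    "meas-mrid-5678": {"quantity": "current", "unit": "A", "terminal": "term-mrid-0001"},
}


def decode(message: str) -> list:
    """Attach model context to each {mRID: value} pair in a JSON message."""
    readings = []
    for mrid, value in json.loads(message).items():
        context = MEASUREMENT_CONTEXT[mrid]  # KeyError flags an unknown measurement
        readings.append({"mrid": mrid, "value": value, **context})
    return readings


# The published message carries only identifiers and values.
message = json.dumps({"meas-mrid-1234": 7199.6, "meas-mrid-5678": 41.3})
for reading in decode(message):
    print(reading)
```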

### 6. Populate the Data Profile with Network Model Data

Populating the data profile with actual power system data typically requires custom scripts that:
1. Interpret the empty data profile structure
2. Read source data from existing tools (PSLF, OpenDSS, etc.)
3. Build XML, JSON, or other format files containing actual network model data

**Validation:** Multiple validation levels ensure data quality:
- **Syntactic validation** - XML parsers check if files are "well-formed"
- **Semantic validation** - XSD/JSON schemas ensure objects and attributes are defined correctly
- **Ontological validation** - RDFS and OWL ensure the model follows CIM vocabulary
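The first validation level can be sketched with the standard library alone. The example below checks well-formedness and then performs a deliberately simplified class check that stands in for real XSD validation. The sample document, the `http://example.com/cim#` namespace, and both helper functions are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

GOOD = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:cim="http://example.com/cim#">
  <cim:ACLineSegment rdf:ID="_line1">
    <cim:ACLineSegment.length>120.0</cim:ACLineSegment.length>
  </cim:ACLineSegment>
</rdf:RDF>"""

# Drop the closing tag of the line segment to break the document.
BAD = GOOD.replace("</cim:ACLineSegment>", "")


def is_well_formed(xml_text: str) -> bool:
    """Syntactic validation: does the text parse as XML at all?"""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def unknown_classes(xml_text: str, allowed: set) -> set:
    """Simplified stand-in for schema validation: report top-level element
    names (namespace stripped) that are not classes in the profile."""
    root = ET.fromstring(xml_text)
    found = {child.tag.split("}")[-1] for child in root}
    return found - allowed


print(is_well_formed(GOOD), is_well_formed(BAD))
print(unknown_classes(GOOD, {"Terminal"}))
```

A production pipeline would replace `unknown_classes` with validation against the actual XSD or RDFS for the chosen profile.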

To date, no convenient tools exist for building distribution feeders or transmission networks natively in CIM format. The practical approach is to build models in existing tools and convert them to CIM.

### 5. Create a Data Profile

Once a CIM Profile has been selected or created, create a **data profile** to contain the power system network model. Remember: the data profile is not the data itself—it's an empty file structure specifying how data should be organized.

Data profiles may be derived directly from UML information models or UML profiles. Common formats include:
- **XML Schema Definition (XSD)** - Popular for validating XML files
- **JSON Schema** - JSON maps directly onto Python dictionaries and lists, so it is easily parsed
- **Relational DDL** - For SQL databases
- **RDF Serialization** - For semantic web applications
- **Python Dataclasses** - Used by CIMantic Graphs

#### Layered Data Profiles

Data profiles can be layered like GIS layers to build up complete power system representations:

1. **Topology Profile** - Describes how buses, switches, and branches connect (connectivity using CIM vocabulary)
2. **Equipment Profile** - Contains nameplate info, physical characteristics, line impedances
3. **Geographical Profile** - Provides geospatial information about asset locations
4. **State Profile** - Represents dynamic operational states and measurements

This layered approach provides flexibility in what information is exchanged and stored.
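One way to picture the layering is as separate records keyed by the same mRID and merged only when a consumer needs them. The layer contents, attribute names, and the `merge_layers` helper below are hypothetical.

```python
from collections import defaultdict

# Each layer holds only its own kind of information, keyed by equipment mRID.
topology = {"line-1": {"from_node": "n1", "to_node": "n2"}}
equipment = {"line-1": {"length_m": 120.0, "conductor": "ACSR 4/0"}}
geography = {"line-1": {"coordinates": [(46.28, -119.28), (46.29, -119.27)]}}


def merge_layers(*layers):
    """Combine the layers a consumer needs into one record per mRID."""
    merged = defaultdict(dict)
    for layer in layers:
        for mrid, attrs in layer.items():
            merged[mrid].update(attrs)
    return dict(merged)


# A connectivity-only application reads just the topology layer,
# while a mapping tool combines topology with geography.
print(merge_layers(topology))
print(merge_layers(topology, geography))
```

This only works because every layer refers to the equipment by the same persistent mRID.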

### 4. Identify the Required Classes and Attributes

Once the minimum required information has been identified, reduce the CIM information model to a much smaller profile with only the classes, attributes, and associations that are needed. This custom subset is your **CIM Profile**.

**Example:** For DER-to-substation topology mapping, an extremely small profile suffices:
- `ACLineSegment`
- `LoadBreakSwitch`
- `PowerTransformerEnd`
- `SynchronousMachine`
- `PowerElectronicsConnection`
- `ConnectivityNode`
- `ACDCTerminal`

If power flow solutions aren't needed, classes like `PhaseImpedanceData`, `TransformerMeshImpedance`, and `RatioTapChanger` can be omitted.

You can build a custom CIM profile using open-source tools like **CIMTool** or adopt an existing tested profile. If extensions beyond the base CIM are needed (e.g., home appliances, DER control attributes), involve someone with expertise in both power systems and information modeling.

### 3. Identify the Required Information to be Modeled

Although it's possible to use the entire CIM to represent every object and attribute in the power system, it's more practical to reduce the modeling scope to the **minimum information required** to accomplish the functional objectives. This requires careful consideration of:

- What data needs to be exchanged
- Whether a full power flow solution is needed
- The level of modeling accuracy required

**Example:** For a DER-to-substation mapping application, the minimum required information is simply a topology model of the network connectivity (lines, transformers, switches, DERs). Detailed modeling of line geometry, transformer impedances, tap settings, and reactive control modes is unnecessary since these don't impact connectivity.
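The following sketch shows why connectivity alone is enough for this kind of mapping: a breadth-first search over closed branches finds the substation feeding a DER without any impedance or tap data. The network, the names, and the `find_substation` helper are made up for illustration.

```python
from collections import deque

# (equipment name, node A, node B, closed?) for a made-up feeder
BRANCHES = [
    ("xfmr_sub", "sub_bus", "n1", True),
    ("line_1", "n1", "n2", True),
    ("switch_1", "n2", "n3", True),
    ("line_2", "n3", "n4", True),
    ("switch_tie", "n4", "other_sub_bus", False),  # normally open tie
]
DERS = {"pv_1": "n4"}
SUBSTATIONS = {"sub_bus": "Substation A", "other_sub_bus": "Substation B"}


def find_substation(der, branches, ders, substations):
    """Breadth-first search from the DER's node across closed branches."""
    adjacency = {}
    for _, a, b, closed in branches:
        if closed:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    start = ders[der]
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node in substations:
            return substations[node]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return None  # DER is islanded


print(find_substation("pv_1", BRANCHES, DERS, SUBSTATIONS))
```

Changing only the switch states in `BRANCHES` changes the answer, which is exactly the dynamic information a topology-only profile needs to carry.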

### 2. Identify the Use Case for CIM

The next step is identifying the particular use case for adopting CIM and the scope of data integration required. Possible approaches include:

- **Project-Level:** Strictly using CIM to exchange base power system models between applications
- **Platform-Level:** Integrating applications through a common message bus with CIM-compliant messaging and models
- **Integrated Environment:** Creating a data-rich environment using CIM classes, attributes, and message formats to the greatest extent possible

The decision depends on the particular use case and types of data that need to be exchanged. For real-time applications requiring topology mapping with SCADA data, DER dispatch commands, and dynamic switch positions, a CIM-based message bus is more practical than point-to-point exchanges.

## Steps for Selecting and Creating a CIM Profile

### 1. Understand Business, Planning, and Operations Processes

Adoption of the CIM cannot occur in a vacuum; it should account for the operational procedures and business processes that will be affected. This process involves:

- Understanding organizational units and groups involved
- Documenting existing operational procedures
- Identifying current data storage formats
- Mapping legacy application interfaces and data streams
- Translating requirements into functional objectives and performance specifications

**Important Consideration:** If the project involves operations or planning decisions, ensure data is presented in a format meaningful to operators and dispatchers. Many CIM components are not human-readable in traditional operations contexts, so proper human factors engineering is essential to avoid adding cognitive workload in high-stress control room environments.

## CIM Profile vs Data Profile

It's important to distinguish between these two related but distinct concepts:

**CIM Profile:** A UML-based subset of the CIM information model defining which classes, attributes, and associations are needed for a specific application or use case.

**Data Profile:** A serialization of a CIM Profile into a specific physical data structure (repository or stream). Examples include:
- JavaScript Object Notation (JSON) schema
- XML Schema Definition (XSD)
- Relational data definition language (DDL)
- Resource Description Framework (RDF) serialization
- **Python dataclass schemas** (used by CIMantic Graphs)

The data profile is not the power system data itself—it's an empty file structure specifying how network model data should be organized.
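To illustrate the distinction, the sketch below renders one small profile into two different data profiles, a JSON-Schema-style dictionary and a SQL `CREATE TABLE` statement, and then shows a data record that would populate either. The `PROFILE` layout, the type mapping, and both helper functions are simplified assumptions, not the output of any CIM tool.

```python
# A tiny profile: class -> {attribute: (primitive type, required?)}
PROFILE = {
    "ACLineSegment": {
        "mRID": ("string", True),
        "name": ("string", False),
        "length": ("number", False),
    }
}

SQL_TYPES = {"string": "TEXT", "number": "REAL"}


def to_json_schema(profile):
    """Render the profile as JSON-Schema-style dictionaries (one per class)."""
    return {
        cls: {
            "type": "object",
            "properties": {attr: {"type": t} for attr, (t, _) in attrs.items()},
            "required": [attr for attr, (_, req) in attrs.items() if req],
        }
        for cls, attrs in profile.items()
    }


def to_ddl(profile):
    """Render the same profile as relational CREATE TABLE statements."""
    statements = []
    for cls, attrs in profile.items():
        columns = ", ".join(
            f"{attr} {SQL_TYPES[t]}{' NOT NULL' if req else ''}"
            for attr, (t, req) in attrs.items()
        )
        statements.append(f"CREATE TABLE {cls} ({columns});")
    return statements


# Both outputs are empty structures; this record is the actual data.
record = {"mRID": "line-1", "name": "Feeder segment", "length": 120.0}

print(to_json_schema(PROFILE)["ACLineSegment"]["required"])
print(to_ddl(PROFILE)[0])
```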