feat(mode): add mode analytics ingestion source (#3710)
gabe-lyons committed Dec 10, 2021
1 parent bd4ecbc commit 8394fc6
Showing 16 changed files with 1,995 additions and 35 deletions.
Binary file added datahub-web-react/src/images/modelogo.png
71 changes: 36 additions & 35 deletions metadata-ingestion/README.md
@@ -32,41 +32,42 @@ We use a plugin architecture so that you can install only the dependencies you a

Sources:

| Plugin Name | Install Command | Provides |
| ----------------------------------------------- | ---------------------------------------------------------- | ----------------------------------- |
| [file](./source_docs/file.md) | _included by default_ | File source and sink |
| [athena](./source_docs/athena.md) | `pip install 'acryl-datahub[athena]'` | AWS Athena source |
| [bigquery](./source_docs/bigquery.md) | `pip install 'acryl-datahub[bigquery]'` | BigQuery source |
| [bigquery-usage](./source_docs/bigquery.md) | `pip install 'acryl-datahub[bigquery-usage]'` | BigQuery usage statistics source |
| [datahub-business-glossary](./source_docs/business_glossary.md) | _no additional dependencies_ | Business Glossary File source |
| [dbt](./source_docs/dbt.md) | _no additional dependencies_ | dbt source |
| [druid](./source_docs/druid.md) | `pip install 'acryl-datahub[druid]'` | Druid Source |
| [feast](./source_docs/feast.md) | `pip install 'acryl-datahub[feast]'` | Feast source |
| [glue](./source_docs/glue.md) | `pip install 'acryl-datahub[glue]'` | AWS Glue source |
| [hive](./source_docs/hive.md) | `pip install 'acryl-datahub[hive]'` | Hive source |
| [kafka](./source_docs/kafka.md) | `pip install 'acryl-datahub[kafka]'` | Kafka source |
| [kafka-connect](./source_docs/kafka-connect.md) | `pip install 'acryl-datahub[kafka-connect]'` | Kafka connect source |
| [ldap](./source_docs/ldap.md) | `pip install 'acryl-datahub[ldap]'` ([extra requirements]) | LDAP source |
| [looker](./source_docs/looker.md) | `pip install 'acryl-datahub[looker]'` | Looker source |
| [lookml](./source_docs/lookml.md) | `pip install 'acryl-datahub[lookml]'` | LookML source, requires Python 3.7+ |
| [mongodb](./source_docs/mongodb.md) | `pip install 'acryl-datahub[mongodb]'` | MongoDB source |
| [mssql](./source_docs/mssql.md) | `pip install 'acryl-datahub[mssql]'` | SQL Server source |
| [mysql](./source_docs/mysql.md) | `pip install 'acryl-datahub[mysql]'` | MySQL source |
| [mariadb](./source_docs/mariadb.md) | `pip install 'acryl-datahub[mariadb]'` | MariaDB source |
| [openapi](./source_docs/openapi.md) | `pip install 'acryl-datahub[openapi]'` | OpenApi Source |
| [oracle](./source_docs/oracle.md) | `pip install 'acryl-datahub[oracle]'` | Oracle source |
| [postgres](./source_docs/postgres.md) | `pip install 'acryl-datahub[postgres]'` | Postgres source |
| [redash](./source_docs/redash.md) | `pip install 'acryl-datahub[redash]'` | Redash source |
| [redshift](./source_docs/redshift.md) | `pip install 'acryl-datahub[redshift]'` | Redshift source |
| [sagemaker](./source_docs/sagemaker.md) | `pip install 'acryl-datahub[sagemaker]'` | AWS SageMaker source |
| [snowflake](./source_docs/snowflake.md) | `pip install 'acryl-datahub[snowflake]'` | Snowflake source |
| [snowflake-usage](./source_docs/snowflake.md) | `pip install 'acryl-datahub[snowflake-usage]'` | Snowflake usage statistics source |
| [sql-profiles](./source_docs/sql_profiles.md) | `pip install 'acryl-datahub[sql-profiles]'` | Data profiles for SQL-based systems |
| [sqlalchemy](./source_docs/sqlalchemy.md) | `pip install 'acryl-datahub[sqlalchemy]'` | Generic SQLAlchemy source |
| [superset](./source_docs/superset.md) | `pip install 'acryl-datahub[superset]'` | Superset source |
| [trino](./source_docs/trino.md) | `pip install 'acryl-datahub[trino]` | Trino source |
| [starburst-trino-usage](./source_docs/trino.md) | `pip install 'acryl-datahub[starburst-trino-usage]'` | Starburst Trino usage statistics source |
| [nifi](./source_docs/nifi.md) | `pip install 'acryl-datahub[nifi]' | Nifi source |
| Plugin Name | Install Command | Provides |
|-----------------------------------------------------------------|------------------------------------------------------------| ----------------------------------- |
| [file](./source_docs/file.md) | _included by default_ | File source and sink |
| [athena](./source_docs/athena.md) | `pip install 'acryl-datahub[athena]'` | AWS Athena source |
| [bigquery](./source_docs/bigquery.md) | `pip install 'acryl-datahub[bigquery]'` | BigQuery source |
| [bigquery-usage](./source_docs/bigquery.md) | `pip install 'acryl-datahub[bigquery-usage]'` | BigQuery usage statistics source |
| [datahub-business-glossary](./source_docs/business_glossary.md) | _no additional dependencies_ | Business Glossary File source |
| [dbt](./source_docs/dbt.md) | _no additional dependencies_ | dbt source |
| [druid](./source_docs/druid.md) | `pip install 'acryl-datahub[druid]'` | Druid Source |
| [feast](./source_docs/feast.md) | `pip install 'acryl-datahub[feast]'` | Feast source |
| [glue](./source_docs/glue.md) | `pip install 'acryl-datahub[glue]'` | AWS Glue source |
| [hive](./source_docs/hive.md) | `pip install 'acryl-datahub[hive]'` | Hive source |
| [kafka](./source_docs/kafka.md) | `pip install 'acryl-datahub[kafka]'` | Kafka source |
| [kafka-connect](./source_docs/kafka-connect.md) | `pip install 'acryl-datahub[kafka-connect]'` | Kafka connect source |
| [ldap](./source_docs/ldap.md) | `pip install 'acryl-datahub[ldap]'` ([extra requirements]) | LDAP source |
| [looker](./source_docs/looker.md) | `pip install 'acryl-datahub[looker]'` | Looker source |
| [lookml](./source_docs/lookml.md) | `pip install 'acryl-datahub[lookml]'` | LookML source, requires Python 3.7+ |
| [mode](./source_docs/mode.md) | `pip install 'acryl-datahub[mode]'` | Mode Analytics source |
| [mongodb](./source_docs/mongodb.md) | `pip install 'acryl-datahub[mongodb]'` | MongoDB source |
| [mssql](./source_docs/mssql.md) | `pip install 'acryl-datahub[mssql]'` | SQL Server source |
| [mysql](./source_docs/mysql.md) | `pip install 'acryl-datahub[mysql]'` | MySQL source |
| [mariadb](./source_docs/mariadb.md) | `pip install 'acryl-datahub[mariadb]'` | MariaDB source |
| [openapi](./source_docs/openapi.md) | `pip install 'acryl-datahub[openapi]'` | OpenApi Source |
| [oracle](./source_docs/oracle.md) | `pip install 'acryl-datahub[oracle]'` | Oracle source |
| [postgres](./source_docs/postgres.md) | `pip install 'acryl-datahub[postgres]'` | Postgres source |
| [redash](./source_docs/redash.md) | `pip install 'acryl-datahub[redash]'` | Redash source |
| [redshift](./source_docs/redshift.md) | `pip install 'acryl-datahub[redshift]'` | Redshift source |
| [sagemaker](./source_docs/sagemaker.md) | `pip install 'acryl-datahub[sagemaker]'` | AWS SageMaker source |
| [snowflake](./source_docs/snowflake.md) | `pip install 'acryl-datahub[snowflake]'` | Snowflake source |
| [snowflake-usage](./source_docs/snowflake.md) | `pip install 'acryl-datahub[snowflake-usage]'` | Snowflake usage statistics source |
| [sql-profiles](./source_docs/sql_profiles.md) | `pip install 'acryl-datahub[sql-profiles]'` | Data profiles for SQL-based systems |
| [sqlalchemy](./source_docs/sqlalchemy.md) | `pip install 'acryl-datahub[sqlalchemy]'` | Generic SQLAlchemy source |
| [superset](./source_docs/superset.md) | `pip install 'acryl-datahub[superset]'` | Superset source |
| [trino](./source_docs/trino.md) | `pip install 'acryl-datahub[trino]'` | Trino source |
| [starburst-trino-usage](./source_docs/trino.md) | `pip install 'acryl-datahub[starburst-trino-usage]'` | Starburst Trino usage statistics source |
| [nifi](./source_docs/nifi.md) | `pip install 'acryl-datahub[nifi]'` | Nifi source |

Sinks

20 changes: 20 additions & 0 deletions metadata-ingestion/examples/mce_files/data_platforms.json
@@ -197,6 +197,26 @@
},
"proposedDelta": null
},
{
"auditHeader": null,
"proposedSnapshot": {
"com.linkedin.pegasus2avro.metadata.snapshot.DataPlatformSnapshot": {
"urn": "urn:li:dataPlatform:mode",
"aspects": [
{
"com.linkedin.pegasus2avro.dataplatform.DataPlatformInfo": {
"datasetNameDelimiter": ".",
"name": "mode",
"displayName": "Mode",
"type": "KEY_VALUE_STORE",
"logoUrl": "https://raw.githubusercontent.com/linkedin/datahub/master/datahub-web-react/src/images/modelogo.png"
}
}
]
}
},
"proposedDelta": null
},
{
"auditHeader": null,
"proposedSnapshot": {
16 changes: 16 additions & 0 deletions metadata-ingestion/examples/recipes/mode_to_datahub.yml
@@ -0,0 +1,16 @@
# see https://datahubproject.io/docs/metadata-ingestion/source_docs/mode for complete documentation
source:
  type: "mode"
  config:
    token: 9fa6a90fcd33
    password: a03bcbc011d6f77c585f5682
    connect_uri: https://app.mode.com/
    workspace: "petabloc"
    default_schema: "public"
    owner_username_instead_of_email: False

# see https://datahubproject.io/docs/metadata-ingestion/sink_docs/datahub for complete documentation
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
2 changes: 2 additions & 0 deletions metadata-ingestion/setup.py
@@ -112,6 +112,7 @@ def get_long_description():
"ldap": {"python-ldap>=2.4"},
"looker": looker_common,
"lookml": looker_common | {"lkml>=1.1.0", "sql-metadata==2.2.2"},
"mode": {"requests", "sqllineage"},
"mongodb": {"pymongo>=3.11"},
"mssql": sql_common | {"sqlalchemy-pytds>=0.3"},
"mssql-odbc": sql_common | {"pyodbc"},
@@ -282,6 +283,7 @@ def get_long_description():
"looker = datahub.ingestion.source.looker:LookerDashboardSource",
"lookml = datahub.ingestion.source.lookml:LookMLSource",
"datahub-business-glossary = datahub.ingestion.source.metadata.business_glossary:BusinessGlossaryFileSource",
"mode = datahub.ingestion.source.mode:ModeSource",
"mongodb = datahub.ingestion.source.mongodb:MongoDBSource",
"mssql = datahub.ingestion.source.sql.mssql:SQLServerSource",
"mysql = datahub.ingestion.source.sql.mysql:MySQLSource",
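
The entry point strings in this hunk follow the setuptools `name = module:attr` convention. As a minimal sketch of how such a string breaks down, the helper below splits one into its parts. `EntryPointSpec` and `parse_entry_point` are hypothetical names used only for illustration; they are not part of DataHub or setuptools, and the assumption that the name on the left matches the `type:` value in a recipe is inferred from the example recipe in this commit.

```python
from typing import NamedTuple


class EntryPointSpec(NamedTuple):
    name: str    # plugin name (assumed to match `type:` in a recipe)
    module: str  # dotted module path
    attr: str    # object name inside that module


def parse_entry_point(spec: str) -> EntryPointSpec:
    """Split 'name = package.module:Attr' into its three parts."""
    name, _, target = spec.partition("=")
    module, _, attr = target.strip().partition(":")
    name, module, attr = name.strip(), module.strip(), attr.strip()
    if not name or not module or not attr:
        raise ValueError(f"malformed entry point: {spec!r}")
    return EntryPointSpec(name, module, attr)


spec = parse_entry_point("mode = datahub.ingestion.source.mode:ModeSource")
print(spec.name, spec.module, spec.attr)
```

Parsing the string by hand keeps the sketch independent of any installed package; a real loader would import the module and look up the attribute.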
113 changes: 113 additions & 0 deletions metadata-ingestion/source_docs/mode.md
@@ -0,0 +1,113 @@
# Mode

For context on getting started with ingestion, check out our [metadata ingestion guide](../README.md).

## Setup

To install this plugin, run `pip install 'acryl-datahub[mode]'`.

See documentation for Mode's API at https://mode.com/developer/api-reference/introduction/


## Capabilities

This plugin extracts Charts, Reports, and associated metadata from a given Mode workspace. This plugin is in beta and has only been tested
against a PostgreSQL database.

### Report

The [/api/{account}/reports/{report}](https://mode.com/developer/api-reference/analytics/reports/) endpoint is used to
retrieve the following report information.

- Title and description
- Last edited by
- Owner
- Link to the Report in Mode for exploration
- Associated charts within the report

### Chart

The [/api/{workspace}/reports/{report}/queries/{query}/charts](https://mode.com/developer/api-reference/analytics/charts/#getChart) endpoint is used to
retrieve the following chart information.

- Title and description
- Last edited by
- Owner
- Link to the chart in Mode
- Datasource and lineage information from Report queries.
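
As a rough illustration of how the report and chart endpoint paths combine with `connect_uri`, the sketch below fills in the two path templates quoted in this document. `build_url` is a hypothetical helper, not the plugin's code, and the `abc123` and `def456` tokens are made-up placeholders.

```python
from urllib.parse import quote

# Path templates as quoted in this document.
REPORT_PATH = "/api/{account}/reports/{report}"
CHARTS_PATH = "/api/{workspace}/reports/{report}/queries/{query}/charts"


def build_url(connect_uri: str, template: str, **parts: str) -> str:
    """Fill a path template and join it onto the Mode host URL."""
    # Percent-encode each segment so unusual characters cannot alter the path.
    encoded = {key: quote(value, safe="") for key, value in parts.items()}
    # Drop any trailing slash on the host so the result has no double slash.
    return connect_uri.rstrip("/") + template.format(**encoded)


report_url = build_url(
    "https://app.mode.com/", REPORT_PATH, account="datahub", report="abc123"
)
charts_url = build_url(
    "https://app.mode.com",
    CHARTS_PATH,
    workspace="datahub",
    report="abc123",
    query="def456",
)
print(report_url)
print(charts_url)
```

Stripping the trailing slash means both `https://app.mode.com` and `https://app.mode.com/` (the form used in the example recipe) produce the same URL.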

The following properties for a chart are ingested in DataHub.

#### Chart Information
| Name | Description |
|-----------|----------------------------------------|
| `Filters` | Filters applied to the chart |
| `Metrics` | Fields or columns used for aggregation |
| `X` | Fields used in X-axis |
| `X2` | Fields used in second X-axis |
| `Y` | Fields used in Y-axis |
| `Y2` | Fields used in second Y-axis |


#### Table Information
| Name | Description |
|-----------|------------------------------|
| `Columns` | Column names in a table |
| `Filters` | Filters applied to the table |



#### Pivot Table Information
| Name | Description |
|-----------|----------------------------------------|
| `Columns` | Column names in a table |
| `Filters` | Filters applied to the table |
| `Metrics` | Fields or columns used for aggregation |
| `Rows` | Row names in a table |

## Quickstart recipe

Check out the following recipe to get started with ingestion! See [below](#config-details) for full configuration options.

For general pointers on writing and running a recipe, see our [main recipe guide](../README.md#recipes).

```yml
source:
  type: mode
  config:
    # Coordinates
    connect_uri: https://app.mode.com

    # Credentials
    token: token
    password: pass

    # Options
    workspace: "datahub"
    default_schema: "public"

sink:
  # sink configs
```

## Config details

| Field            | Required | Default                  | Description                                                        |
|------------------|----------|--------------------------|--------------------------------------------------------------------|
| `connect_uri`    | ✅       | `"https://app.mode.com"` | Mode host URL.                                                     |
| `token`          | ✅       |                          | Mode user token.                                                   |
| `password`       | ✅       |                          | Mode password for authentication.                                  |
| `default_schema` |          | `public`                 | Default schema to use when schema is not provided in an SQL query. |
| `env`            |          | `"PROD"`                 | Environment to use in namespace when constructing URNs.            |
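
To make the defaults in the table concrete, here is a minimal sketch of merging a recipe's `config` block with them. `resolve_config` is a hypothetical helper and not how the plugin itself validates configuration; treating `token` and `password` as mandatory is an assumption based on their having no default in the table.

```python
from typing import Any, Dict

# Defaults copied from the "Config details" table.
DEFAULTS: Dict[str, Any] = {
    "connect_uri": "https://app.mode.com",
    "default_schema": "public",
    "env": "PROD",
}

# Assumption: fields without a default must be set in the recipe.
NO_DEFAULT = ("token", "password")


def resolve_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return the config with table defaults applied to unset fields."""
    missing = [key for key in NO_DEFAULT if not config.get(key)]
    if missing:
        raise ValueError("missing config fields: " + ", ".join(missing))
    # Values from the recipe take precedence over the defaults.
    return {**DEFAULTS, **config}


resolved = resolve_config(
    {"token": "token", "password": "pass", "default_schema": "analytics"}
)
print(resolved["connect_uri"], resolved["default_schema"], resolved["env"])
```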

See Mode's [Authentication documentation](https://mode.com/developer/api-reference/authentication/) on how to generate `token` and `password`.
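
As a hedged sketch of how the `token` and `password` pair might be sent, the snippet below builds an HTTP Basic `Authorization` header using only the standard library. It assumes Mode's API accepts Basic authentication with the token as the username, which should be confirmed against the Authentication documentation linked above; `basic_auth_header` is a hypothetical helper, not part of the plugin.

```python
import base64
from typing import Dict


def basic_auth_header(token: str, password: str) -> Dict[str, str]:
    """Build an HTTP Basic Authorization header from a token/password pair."""
    # Basic auth is base64("username:password"); the token is assumed
    # to play the role of the username here.
    raw = f"{token}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


headers = basic_auth_header("token", "pass")
print(headers["Authorization"])
```

Base64 is an encoding, not encryption, so a header like this should only be sent over HTTPS.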

## Compatibility

Coming soon!


## Questions

If you've got any questions on configuring this source, feel free to ping us on
[our Slack](https://slack.datahubproject.io/)!
