DataHub v0.9.0

@szalai1 released this 13 Oct 11:26
0427122

Release Highlights

Known Issues

Assertions Tab UX bug

This release introduced a bug in the assertions tab causing assertion results to be hidden. This will be addressed in the subsequent release.

Release Notes

We’re excited to announce the release of DataHub v0.9.0!

This minor release upgrades DataHub to Java 11 and surfaces Column-Level Lineage in the DataHub UI.

Here are some additional highlights:

User Experience

  • Column-Level Lineage is now surfaced within the DataHub UI!
  • Advanced Search now supports searching by Column-level details (e.g. name, description, tag), as well as complex AND/OR statements. For example:
    • Show results that match any filters
    • Show results that match all filters
    • Owner is either Shannon or Mark
    • Owner is neither Shannon nor Mark
    • Try it in the demo here
  • You can now invite users and assign them to a default DataHub Role
  • Improvements to site performance during the Browse experience
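
The Advanced Search examples listed above (match any, match all, "is either", "is neither") can be read as simple boolean combinations of filters. The following is a minimal sketch of those semantics only. It is not DataHub's implementation or API: the `Filter` class, the `match_all` and `match_any` helpers, and the field names `owner` and `fieldNames` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical model of a search result: a field name mapped to its values.
Entity = Dict[str, List[str]]


@dataclass
class Filter:
    field: str             # illustrative field name, e.g. "owner"
    values: List[str]      # the filter hits if the entity has any of these values
    negated: bool = False  # True expresses "is neither ... nor ..."

    def matches(self, entity: Entity) -> bool:
        hit = any(v in entity.get(self.field, []) for v in self.values)
        return not hit if self.negated else hit


def match_all(filters: List[Filter], entity: Entity) -> bool:
    """AND semantics: every filter must match."""
    return all(f.matches(entity) for f in filters)


def match_any(filters: List[Filter], entity: Entity) -> bool:
    """OR semantics: at least one filter must match."""
    return any(f.matches(entity) for f in filters)


# A made-up dataset owned by Shannon with two columns.
dataset: Entity = {"owner": ["Shannon"], "fieldNames": ["user_id", "email"]}

owner_is_shannon_or_mark = Filter("owner", ["Shannon", "Mark"])
owner_is_neither = Filter("owner", ["Shannon", "Mark"], negated=True)
has_email_column = Filter("fieldNames", ["email"])
```

With this sketch, `match_all([owner_is_shannon_or_mark, has_email_column], dataset)` holds because both filters hit, while `match_any([owner_is_neither, has_email_column], dataset)` holds only because of the column filter.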

Developer Experience

  • DataHub has been upgraded to Java 11!
  • Improved tracking of GraphQL errors for bug resolution
  • CorpUser and CorpGroup are now available via the Python SDK

Metadata Ingestion

  • Automatically extract Column-Level Lineage from Snowflake & Looker sources
  • dbt Meta Mapping is now supported at the column level, so you can automatically extract Tags and Glossary Terms from the column metadata in your dbt models and surface them in DataHub
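
To make the column-level meta mapping idea concrete, here is a rough sketch of how rules keyed on a column's dbt `meta` properties could be turned into tags and glossary terms. It is an illustration under stated assumptions, not the dbt source's actual code or configuration schema: the rule layout, the operation names `add_tag` and `add_term`, and the `apply_column_meta_mapping` function are hypothetical.

```python
import re
from typing import Any, Dict, List, Tuple

# Hypothetical rules: each key is a property expected in a column's dbt `meta`.
# "match" is a regular expression tested against the property's value.
COLUMN_META_MAPPING: Dict[str, Dict[str, str]] = {
    "is_sensitive": {"match": "True", "operation": "add_tag", "tag": "sensitive"},
    "glossary_term": {"match": ".*", "operation": "add_term"},
}


def apply_column_meta_mapping(
    column_meta: Dict[str, Any],
    mapping: Dict[str, Dict[str, str]],
) -> Tuple[List[str], List[str]]:
    """Return (tags, terms) derived from one column's meta properties."""
    tags: List[str] = []
    terms: List[str] = []
    for key, rule in mapping.items():
        if key not in column_meta:
            continue  # the column does not define this property
        value = str(column_meta[key])
        if not re.fullmatch(rule["match"], value):
            continue  # the value does not satisfy the rule's pattern
        if rule["operation"] == "add_tag":
            tags.append(rule["tag"])
        elif rule["operation"] == "add_term":
            terms.append(value)  # use the matched value as the term name
    return tags, terms


# Made-up meta block for an `email` column in a dbt model.
email_meta = {"is_sensitive": True, "glossary_term": "CustomerEmail"}
tags, terms = apply_column_meta_mapping(email_meta, COLUMN_META_MAPPING)
```

In this sketch the `email` column ends up with the tag `sensitive` and the term `CustomerEmail`. A column whose `is_sensitive` property is `False` gets no tag, because the value does not match the rule's pattern.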

What's Changed

  • fix(ingest): bigquery-beta - Getting datasets with biquery client by @treff7es in #6039
  • feat(roles): add ability to invite users into a role by @aditya-radhakrishnan in #6015
  • refactor(java11) - convert most modules to java 11 by @leifker in #5836
  • docs(readme): Fixing broken article link by @davrax in #6042
  • refactor(ingest): streamline pydantic configs by @hsheth2 in #6011
  • docs(ingest): add example of dbt column_meta_mapping by @hsheth2 in #6038
  • refactor(ingest): use aspect map in transformers by @hsheth2 in #6040
  • feat(ui): Adding placeholder entity for DataPlatform by @jjoyce0510 in #6045
  • feat(ingest): implement compression for CheckpointState by @alexey-kravtsov in #6007
  • feat(advanced-search): adding select value modal by @gabe-lyons in #6026
  • fix(ingest): bigquery-beta - Additional fixes for Bigquery beta by @treff7es in #6051
  • feat(advanced search): adding advanced search filter component & prereqs for it by @gabe-lyons in #6055
  • docs(ingest): add path spec examples for s3 by @mayurinehate in #6050
  • fix(deps): metadata-io - remove parquet dependency by @shirshanka in #6046
  • fix(ingestion): Tableau test case execution fix by @mohdsiddique in #6005
  • feat(ingest): list referenced env variables in recipe by @hsheth2 in #6043
  • fix(ingest): compat with mypy 0.981 by @hsheth2 in #6056
  • fix(elasticsearch_index): create datahub_usage_event index where datahub_analytics_enabled set to false by @GyuhoonK in #5974
  • docs(approval workflows): adding approval workflow docs by @gabe-lyons in #5896
  • feat(retention): disable applying retention on bootstrap by @anshbansal in #6066
  • fix(ingest): correct tableau browse paths by @hsheth2 in #6064
  • fix(ingest): bigquery-beta - handling complex types properly by @treff7es in #6062
  • docs: create SECURITY.md by @laulpogan in #6069
  • fix(containers): show soft deleted status of containers by @gabe-lyons in #6072
  • docs(ingest): clarify bigquery-beta multiproject setup by @hsheth2 in #6071
  • chore(setup): change defaults for partitions by @anshbansal in #6074
  • refactor(browse): Improving Browse Feature Performance by @jjoyce0510 in #6073
  • feat(ingest): add column-level lineage support for snowflake by @mayurinehate in #6034
  • feat(ingest): looker - support for simple column level lineage by @shirshanka in #6084
  • fix(elastic-setup) Fixing env var logic by @pedro93 in #6079
  • Revert "chore(setup): change defaults for partitions (#6074)" by @pedro93 in #6086
  • fix(mae-consumer): fix regression on base64 encoding by @codesorcery in #6061
  • fix(elasticsearch) Analytics indices creation on AWS ES by @tomas-kubin in #5502
  • docs(ingest): note that Athena doesn't support lineage by @hsheth2 in #6081
  • fix(ingest): alias for mssql-odbc source by @hsheth2 in #6080
  • fix(ingest): presto-on-hive - Setting display name properly by @treff7es in #6065
  • fix(schema filter): fix schema infinite rerender by @gabe-lyons in #6082
  • feat(monitoring): track graphql errors in metrics by @szalai1 in #6087
  • feat(advanced search): Add component to show all advanced search filters & add new filter by @gabe-lyons in #6058
  • fix(ingest): bump lkml version by @hsheth2 in #6091
  • fix(ingest): lookml - extract column correctly by @shirshanka in #6093
  • feat(retention): change default policy, add API to apply retention by @anshbansal in #6088
  • fix(lineage): fix missed casing in lineage registry by @gabe-lyons in #6078
  • fix(ingest): bigquery-beta - Lowering a bit memory footprint of bigquery usage by @treff7es in #6095
  • feat(ingest): remove hardcoded env variable default for cli version by @shirshanka in #6075
  • docs: add information about mapping ports for datahub-gms by @shirshanka in #6092
  • chore(deps): upgrade graphql-java deps to 19.0 by @shirshanka in #6099
  • chore(deps): upgrade neo4j to 4.4.x by @shirshanka in #6101
  • feat(docs): Improve documentation about Search by @szalai1 in #5889
  • feat(ingest): add async option to ingest proposal endpoint by @RyanHolstien in #6097
  • chore(deps): upgrade opentelemetry dependencies by @shirshanka in #6100
  • refactor(recommendations): Bump default max recommendations count for Platforms by @jjoyce0510 in #6113
  • feat(ingest): add Sandbox support by @rgudic in #6105
  • fix(mae): use JAVA_TOOL_OPTIONS instead of JDK_JAVA_OPTIONS by @szalai1 in #6114
  • feat(advanced-search): Complete Advanced Search: backend changes & tying UI together by @gabe-lyons in #6068
  • feat(search): improved search snippet FE logic by @gabe-lyons in #6109
  • feat(ingest): add CorpUser and CorpGroup to the Python SDK by @ttaubermarshall-stripe in #5930
  • fix(ingest): hide deprecated path_spec option from config by @hsheth2 in #5944
  • feat(posts): add posts feature to DataHub by @aditya-radhakrishnan in #6110
  • fix(ingest): remove unused mysql golden file by @hsheth2 in #6106
  • fix(ingestion): fix percent change computation in stale_entity_removal by @rslanka in #6121
  • refactor(ingest): use pydantic utilities for NamingPattern by @hsheth2 in #6013
  • fix(ingest): presto-on-hive - not failing on Hive type parsing error by @treff7es in #6118
  • fix(ingest): ignore usage and operation for snowflake datasets withou… by @mayurinehate in #6112
  • refactor(ingest): remove typing workarounds by @hsheth2 in #6108
  • Added information about AUTH_OIDC_EXTRACT_GROUPS_ENABLED by @PrashantKhadke in #6120
  • feat(lineage): show fully qualified task name in lineage UI by @gabe-lyons in #6126
  • docs(tableau): adding a ingestion video by @shirshanka in #6124
  • Sending "getting started" direct to quickstart by @laulpogan in #6125
  • build: Update JNA for M1 Mac by @david-leifker in #6116
  • fix(ingest): bigquery-beta - fix for missing key error if dataset list was empty by @treff7es in #6133
  • fix(ingest): file - add configurability for counting all elements bef… by @shirshanka in #6136
  • Worked on the feature to update group title by @Ankit-Keshari-Vituity in #6047
  • fix(ingest): add trino package max version restriction by @hsheth2 in #6137
  • test(KafkaEmitter): Enable ability to run test locally by @david-leifker in #6123
  • fix(ingest): add column name quoting for approximate count distinct by @hsheth2 in #6107
  • fix(ingestion): add fallback to trino by @IceS2 in #6044
  • perf(search): temporarily disable fetching input fields for search results by @gabe-lyons in #6139
  • feat(lineage) Add Column-Level to Lineage Visualization by @chriscollins3456 in #6138
  • feat(tracking): add telemetry for frontend events by @aditya-radhakrishnan in #6129
  • docs(approvals): update approval permission docs by @gabe-lyons in #6143
  • fix(ingest): fetch workbook tags in workbooks graphql query by @mayurinehate in #6102
  • fix(lineage) Fix possible null pointer exception in UpstreamLineagesMapper by @chriscollins3456 in #6147
  • fix(ingest): bigquery-beta - Eliminate the need for data.read permission for table schema by @treff7es in #6146
  • fix(lineage) Fix batching to ES for impact analysis by @chriscollins3456 in #6149
  • feat(ingest/lookml): add support for local/remote dependencies by @hsheth2 in #6150
  • fix(auth): fix login endpoint to respect session expiration env var by @aditya-radhakrishnan in #6151
  • fix(impact analysis): fixing filtering on impact analysis + cypress tests by @gabe-lyons in #6152
  • docs(favicon): add docs for customizing favicon by @gabe-lyons in #6155
  • fix(ingest): bigquery-beta - ensure that status aspect is emitted for… by @shirshanka in #6154
  • fix(ingest): bigquery - Fix syntax error in get_all_schema_tables_query by @hieunt-itfoss in #6159
  • fix(ingest): allow snowflake profiling to work with geography type by @mayurinehate in #6162
  • feat(ingest): support enabled flag for airflow config by @hsheth2 in #6089
  • refactor(ingest): Tableau cleanup by @hsheth2 in #6131
  • fix(ingest): bigquery-beta - turning sql parsing off in lineage extraction by @treff7es in #6163
  • fix(ingest): allow hiding some fields from the schema by @hsheth2 in #6077

New Contributors

Full Changelog: v0.8.45...v0.9.0