Releases: datahub-project/datahub
DataHub v0.7.1
Notable Highlights
- Lineage Visualization
- Pipelines and Tasks, Flows and Jobs
- Airflow Lineage
- Editable Field Descriptions
- Nested Schema Viz
- Search Improvements
- datahub CLI
- Official PyPi packages
- Production-quality Helm scripts
- New Integrations
- Officially-supported Sources: Airflow, AWS Glue, dbt, Druid, Superset, MongoDB, Oracle
Changelog
- #2440 @dexter-mh-lee feat(k8s): Move helm charts out of contrib
- #2397 @gabe-lyons feat(lineage): implement support for datasets, charts and dashboards downstream lineage fetching in a generic way
- #2434 @adriaanslechten feat(ingest) LDAP groups ingestion
- #2438 @hsheth2 fix(ingest): use entrypoints lib instead of pkg_resources
- #2425 @gabe-lyons feat(ingest): adding superset ingestion source
- #2433 @topwebtek7 fix(react): fix lineage sidebar buttons
- #2436 @hsheth2 fix(ingest): support custom snowflake types
- #2419 @topwebtek7 feat(react): add dataJob, dataFlow entity pages, refactor with fragments
- #2418 @frsann Fix(search): fix datajob and dataflow search mappings
- #2429 @hsheth2 fix(ingest): fix chart type enum serialization and add tests for rest emitter
- #2431 @shirshanka docs: Update agenda for Apr 23 townhall
- #2427 @hsheth2 fix(ingest): ensure upstreams in airflow lineage emission are entities
- #2426 @hsheth2 fix(ingest): include database info for snowflake
- #2424 @hsheth2 feat: add s3 data platform and logo
- #2423 @topwebtek7 feat(react): schema visualization add support for nested structs
- #2422 @topwebtek7 fix(react): lineage sidebar buttons should refer to the selected entity
- #2421 @dexter-mh-lee fix(kafka-setup): Fix start script for kafka setup
- #2417 @topwebtek7 feat(react): update dataset entity default svg icon
- #2411 @thomasplarsson feature(ingestion): Adding the concept of transformers
- #2415 @dexter-mh-lee fix(k8s): Add credentials to kafka-setup job and clean up
- #2412 @hsheth2 feat(ingest): add Kafka-based emitter example
- #2413 @gabe-lyons fix(lineage): allow lineage viz to handle circular dependencies
- #2414 @dexter-mh-lee fix(kafka-setup): Add the correct context to the git workflow for pushing kafka-setup image
- #2403 @hsheth2 fix(ingest): bump avro-gen3
- #2406 @topwebtek7 feat(react): use default entity icon if lineageentity has no icon
- #2408 @hsheth2 fix(ingest): properly handle fieldDiscriminator with restli
- #2409 @hsheth2 fix(ingest): add sqlalchemy extra
- #2398 @G-nther feat(kafka-setup): add option for SSL and topic partition config via environment
- #2404 @dexter-mh-lee feat(k8s): add extraEnvs to setup jobs
- #2407 @topwebtek7 feat(react): add footer buttons in lineage sidebar
- #2405 @thomasplarsson feature(ingestion): Make origin/fabric_type configurable
- #2384 @topwebtek7 feat(react): add padding between tags and description on datasets profile page
- #2396 @gabe-lyons feat(sample): adding sample mces for dataflows and datajobs
- #2400 @hsheth2 fix(ingest): streamline codegen init methods
- #2382 @topwebtek7 feat(react): update schema table to have fixed description column, set line break with max description width
- #2402 @dexter-mh-lee fix: Fix env variable setup for kafka, mysql-setup docker containers
- #2401 @hsheth2 fix(ingest): add db name to postgres URNs
- #2393 @hsheth2 fix(ingest): enable mypy disallow_incomplete_defs and disallow_untyped_decorators
- #2395 @gabe-lyons fix(react): fix access to pictureLink in charts and dashboards
- #2399 @gabe-lyons fix(tags): check description existence on tags
- #2383 @topwebtek7 feat(react): fix long descriptions overflow issue in lineage side panel
- #2392 @hsheth2 refactor(ingest): update test harness to use a compose file per test
- #2391 @topwebtek7 feat(react): fix browse link of last breadcrumb linked to unknown page
- #2385 @dexter-mh-lee feat(mysql-setup): Add the ability to specify database name for mysql-setup
- #2389 @hsheth2 feat(ingest): add generic sqlalchemy source
- #2390 @dexter-mh-lee feat(k8s): Add ability to add service accounts to setup jobs
- #2387 @dexter-mh-lee fix(kafka-topic-convention): Fix DAOs that do not refer to TopicConvention
- #2386 @dexter-mh-lee feat(index): Add index naming convention for elasticsearch
- #2388 @hsheth2 fix(ingest): report correct version status in dev mode
- #2368 @hsheth2 feat(ingest): add Airflow lineage backend
- #2380 @OddCN fix(docs): fix config example for file sink
- #2362 @dexter-mh-lee feat(k8s): Update pods with correct probes and remove unnecessary dependencies
- #2372 @thomasplarsson fix(ingestion): dont crash on non-RecordSchema topics
- #2360 @hsheth2 docs(ingestion): remove outdated data-source-onboarding.md docs
- #2376 @topwebtek7 feat(react): hide Owned By label in card if no owners
- #2373 @shubham49 fix(react): ownership rendering
- #2377 @topwebtek7 feat(react): add null state indicator in user profile when no entities
- #2379 @topwebtek7 feat(react): update avatar to use initial if no image, refactor all avatars with custom one
- #2369 @gabe-lyons feat(lineage): support arbitrary entity types in lineage viz
- #2364 @thomasplarsson fix(ingestion): Support mapping from avro "boolean" and "map" types t…
- #2343 @thomasplarsson fix(ingestion): properly detect optional fields in avro schemas
- #2370 @topwebtek7 feat(react): add empty state UI for browse when no entities
- #2242 @frsann feat(datajob): Datajob graphql query
- #2367 @topwebtek7 feat(react): add dropdown menu links, menu styling, removed warnings
- #2365 @frsann chore(dependabot): Update pyyaml version
- #2366 @topwebtek7 feat(react): add icons on entities, updated styling in LineageViz
- #2351 @hsheth2 fix(ingest): add test for avro serialization and deserialization
- #2361 @hsheth2 feat(cli): Add support for checking docker memory usage
- #2358 @topwebtek7 feat(react): original description shows in edit modal even when the description has been updated
- #2357 @gabe-lyons feat(react): improving error logging on dataset entity
- #2356 @dexter-mh-lee fix(elasticsearch): Fix inconsistencies between documents and elasticsearch mappings
- #2359 @hsheth2 fix(ingest): support python3 -m datahub
- #2353 @hsheth2 chore(ingest): remove unused plugin_requirements.txt file
- #2352 @hsheth2 fix(ingest): bump pybigquery version
- #2350 @hsheth2 fix(ingest): support datahub --version
- #2349 @gabe-lyons feat(lineage): improve lineage re-focus experience
- #2341 @frsann feat(tags): Add tag graph builder
- #2348 @jjoyce0510 fix(Ember App): Allow ember build (disabled by default)
- #2345 @hsheth2 fix(cli): add --verbose flag for datahub check plugins
- #2346 @gabe-lyons fix(lineage): add upstream arrows back in
- #2347 @hsheth2 feat(ingest): add Oracle db support
- #2336 @topwebtek7 feat(react): add description edit behavior along with modal
- #2340 @gabe-lyons feat(lineage): adding ghost edges indicating hidden dependencies
- #2331 @hsheth2 feat(ingest): start airflow integration + metadata builders
- #2339 @hsheth2 fix(ingest): add support for database and table patterns to glue source
- #2338 @hsheth2 fix(docker): remove restart: always from docker-compose for consistency
- #2335 @gabe-lyons feat(lineage): adding directionality to lineage edges to make the visualization more clear
- #2337 @gabe-lyons fix(lineage): fixing lineage layout bugs
- #2319 @amonkhouse feat(ingest): adding support for AWS Glue
- #2312 @shakti-garg feat(es-setup): add logic in elasticsearch setup to compare-and-update index if already exists
- #2333 @gabe-lyons feat(lineage): expandable lineage visualization for dataset <> dataset lineage
- #2332 @hsheth2 docs: add wolt logo to frontpage
- #2315 @grantatspothero feat(ingest): adds experimental support for ingesting Looker metadata
- #2330 @luck02 fix(test): dbt-manifest files
- #2329 @topwebtek7 feat(react): moving filter panel from modal to drawer
- #2328 @hsheth2 build: remove deprecated ember app from build
- #2327 @hsheth2 feat(ingest): verify dynamic registry types at runtime
- #2316 @joemirizio feat(ingest): dynamically register plugins
- #2325 @hsheth2 fix(ingest): remove outdated metadata-ingestion scripts
- #2313 @shakti-garg fix(k8s): make es-setup job parameters more contextual
- #2322 @gabe-lyons docs(theme): making yarn start instructions more explicit
- #2317 @hsheth2 doc: update slack links to https
- #2324 @frsann fix(datajob): Fix URI templates for datajob and dataflow
- #2320 @frsann fix(tags): Support creating tags with MCE
- #2323 @arunvasudevan fix(docs): Update metadata-serving.md
- #2318 @dexter-mh-lee fix(docker): Fix issue in gms start.sh
- #2321 @shirshanka docs: Update next townhall details, fixup links and misc docs
- #2251 @bernardino feat(Kubernetes): Add JMX exporter containers to all DataHub components
- #2308 @dexter-mh-lee fix(search): Fix styling for column match snippet
- #2302 @shakti-garg feat(k8s): Add k8s hook in datahub helm chart for setting up elasticsearch
- #2298 @dexter-mh-lee feat(docker): Add dockerfile for initializing an existing mysql server
- #2297 @shakti-garg feat(kafka-config): add variable KAFKA_CONSUMER_GROUP_ID to ove...
DataHub v0.7.0
Notable Highlights
- New React Application re-written from the ground up
- Support for GraphQL
- New Metadata Ingestion Framework (Python)
- Officially-supported Sources: Kafka, MySQL, SQL Server, Hive, Postgres, Snowflake, BigQuery, AWS Athena, Druid, LDAP
- New Homepage and Hosted Docs redesign at datahubproject.io
- Product Features: SSO (OIDC), Tags, Themes, Dashboards
- Metadata Backend Implementations: MLModel ecosystem, DataFlow ecosystem
- Move to Elasticsearch 7 (a migration guide from 5.x is available)
Changelog
- #2263 @jplaisted feat(search) BREAKING Support ElasticSearch 7, drop ES5
- #2260 @gabe-lyons fix(tags): fixing margins on tags for long descriptions
- #2259 @hsheth2 docs: update roadmap progress
- #2258 @dexter-mh-lee refactor(demo): Add empty global tags to BigQuery demo data
- #2255 @jjoyce0510 feat(react): Adding shadow and deeper linear gradient
- #2254 @gabe-lyons feat(tags): improving elastic search templates for tags
- #2253 @gabe-lyons fix(tags): fix ownership on tag create
- #2256 @hsheth2 fix: update slack links
- #2248 @gabe-lyons feat(tags): editing tags from react client on datasets, schemas, charts & dashboards
- #2252 @jjoyce0510 refactor(react): React as the default UI
- #2246 @hsheth2 feat(ingest): various minor fixes
- #2245 @jjoyce0510 feat(react): Adding big query logo
- #2249 @gabe-lyons fix(react): enabling charts and dashboards to be supported by theme config
- #2235 @pedro93 feat(ingest): Add support for druid
- #2244 @gabe-lyons feat(react): moving schema tab to be default
- #2243 @shirshanka docs: adding mar-19 townhall agenda
- #2240 @dexter-mh-lee feat(tags): Enable search for datasets by tags
- #2236 @pedro93 feat(k8s): Add metadata-ingestion as a Helm component
- #2241 @shirshanka docs: Improving architecture docs
- #2239 @hsheth2 feat(docs): use gradle for building docs
- #2232 @hsheth2 fix(ingest): various avro codegen fixes
- #2237 @gabe-lyons fix(dataflow): fixing browse dao access
- #2166 @arunvasudevan feat: MLmodel Graphql Query
- #2197 @frsann feat(datajob): Backend implementation
- #2233 @jjoyce0510 refactor(react): All entity search UI + misc improvements
- #2234 @jjoyce0510 docs(react): Oidc React Doc Updates
- #2231 @dexter-mh-lee fix(docker): start issue when there are multiple kafka brokers in bootstrap config
- #2227 @jjoyce0510 refactor(React): Misc UI improvements
- #2230 @hsheth2 fix(ingest): pin version of avro-gen3
- #2226 @hsheth2 fix(ingest): use python extras in docker image
- #2224 @hsheth2 feat(ingest): use plugin system based on Python extras
- #2190 @jjoyce0510 feat(react): SSO support simple OIDC authentication
- #2223 @dexter-mh-lee Added images to es/kafka-setup
- #2222 @dexter-mh-lee fix(ci): rename file to match git workflow needs
- #2220 @dexter-mh-lee fix(ci): remove paths_ignore from workflow files
- #2219 @thomasplarsson refactor(ingest): improve athena source api and documentation
- #2221 @gabe-lyons fix(ci): setting CI to false for builds
- #2218 @gabe-lyons feat(react): hiding raw schema button when no raw schema exists
- #2216 @dexter-mh-lee fix(es-setup): Add git workflows to upload docker for elasticsearch and kafka setup
- #2213 @thomasplarsson feat(ingest): add aws athena ingestion source
- #2217 @gabe-lyons fix(ci): fail CI on react build errors
- #2215 @gabe-lyons fix(react): fix theming test in react and simplifying api
- #2209 @thomasplarsson feat(ingest): add option for optimized skipping of schemas
- #2212 @hsheth2 fix(ingestion): nullable types and timestamp precision
- #2207 @hsheth2 feat(ingest): standalone metadata emitters
- #2205 @dexter-mh-lee fix(ci): Fix github package path
- #2204 @dexter-mh-lee feat(ci): Add SHA based tagging before pushing to docker registries
- #2203 @gabe-lyons feat(tag): adding search for tags in gms layer
- #2193 @gabe-lyons feat(react): adding ability to support theming of datahub, with two themes included
- #2201 @hsheth2 feat: add date and time types to SQL model
- #2202 @thomasplarsson feat(mae-consumer): enable mae-consumer to use ssl when communicating with elasticsearch
- #2199 @thomasplarsson fix(mae-consumer): mae-consumer needs sslcontext bean
- #2181 @shirshanka chore: renaming business_glossary rfc directory with pull request number
- #2182 @shirshanka chore: renaming graphql_frontend rfc directory with pull request number
- #2183 @shirshanka chore: renaming react-app rfc directory with pull request number
- #2196 @shirshanka docs(roadmap): update project roadmap
- #2195 @jjoyce0510 fix(graphql): Add "fixed" SchemaFieldDataType mapping
- #2194 @gabe-lyons feat(tags): Enriching sample data for tags
- #2191 @hsheth2 feat(docs): automatically populate sidebar with RFCs
- #2192 @jplaisted (feat) Simple python script to carry over ES indices from 5 to 7.
- #2173 @brendansun93 feat(React): Ownership component of user profile
- #2189 @thomasplarsson feat(gms): add elasticsearch SSL support
- #2112 @frsann feat(tags): RFC for tags
- #2187 @gabe-lyons fix(react): fixing test issues that arose from ill-timed merges
- #2164 @gabe-lyons feat(tags): adding support for read/write of tags in gms & read-only in react datahub-frontend.
- #2185 @jjoyce0510 feat(graphql): More forgiving for unknown data platforms during reads
- #2184 @jjoyce0510 test(React): Home page tests
- #2186 @hsheth2 fix(docs): fix broken links
- #2179 @gabe-lyons feat(react): adding raw schema view option for table schemas
- #2178 @hsheth2 feat(ingest): bigquery sample data
- #2176 @hsheth2 docs: point to hosted docs site
- #2177 @hsheth2 docs(ingest): clarify setuptools requirement
- #2175 @hsheth2 build(docs): only deploy docs on main repo
- #2174 @hsheth2 docs: hosted documentation website
- #2167 @jjoyce0510 feat(React): Impl browse UI for Dashboards and Charts
- #2168 @jjoyce0510 fix(React): Fix Browse Pagination Bug
- #2172 @hsheth2 fix(ingest): loosen Kafka broker validation
- #2165 @jjoyce0510 feat(DataPlatform Logos): Adding server driven logos
- #2171 @hsheth2 docs(ingest): clarify Kafka connection config
- #2169 @shirshanka doc(townhall): Add links for Feb 19, upcoming townhall on Mar 19
- #2161 @hsheth2 fix(ingest): bigquery source and dataset naming fixes
- #2163 @jjoyce0510 fix(graphql): Bubbling up exceptions logged in GraphQL resolvers
- #2159 @hsheth2 build(ingest): use multi-stage docker build for datahub-ingestion
- #2157 @hsheth2 feat(ingest): capture table descriptions
- #2158 @hsheth2 feat(ingest): switch quickstart to Python ingestion
- #2156 @pedro93 feat(ingest): support alternative authentication in sql ingestion
- #2152 @gabe-lyons fix(react): fixing format we propagate filters to graphql in
- #2154 @gabe-lyons feat(react): Redirecting /assets to index
- #2151 @hsheth2 build(docker): add large generated directories to dockerignore
- #2150 @hsheth2 ci(ingest): setup docker container for metadata ingestion
- #2145 @RickardCardell feat: neo4j Bolt TLS support (#2100)
- #2143 @dexter-mh-lee feat(dashboards): Add browse end point for charts and dashboards
- #2144 @RickardCardell feat: neo4j https support (#2101)
- #2147 @gabe-lyons docs(frontend): Update docs to clarify running local frontend w/ local react app
- #2148 @jjoyce0510 feat(gms): Add optional data platform display name
- #2149 @jplaisted Switch GMA dep from bintray to artifactory.
- #2146 @jjoyce0510 Fixing required audit stamps bug
- #2140 @jjoyce0510 feat(React): Search page UI improvements, 'all' entity search.
- #2133 @thomasplarsson feat(datahub-dao): enable services to access gms over https
- #2136 @hsheth2 feat(ingest): support Postgres PostGIS extensions
- #2139 @gabe-lyons docs(Ownership): making lack of support for ownergroups in frontend explicit in pdl
- #2137 @dexter-mh-lee refactor(docker-dev): set up elasticsearch using local mapping on docker-compose.dev
- #2135 @hsheth2 ci(ingest): run apt update
- #2134 @hsheth2 refactor(ingest): cleanup configuration models
- #2130 @jjoyce0510 feat(React UI): SearchPage and SearchResultsPage
- #2132 @jjoyce0510 Add URL to dashboard / chart page
- #2131 @gabe-lyons fix(React): Adding test coverage for search page & fixing filter select bug
- #2128 @jjoyce0510 fix(react): Fix authenticated user profile
- #2125 @hsheth2 fix(ingest): gracefully handle unknown types
- #2127 @jjoyce0510 feat: Introducing optional DataPlatform logo url
- #2124 @hsheth2 fix(ingest): update sample MCEs based on MLModel changes
- #2126 @jjoyce0510 fix(gms): fix getAllDataPlatforms bug
- #2123 @hsheth2 docs(ingest): add solutions for common install issues
- #2122 @hsheth2 feat(ingest): add support for LDAP ingestion
- #2120 @hsheth2 test(ingest): verify the output of mssql
- #2119 @jjoyce0510 feat(React): Adding basic chart + dashboard UI
- #2115 @brendansun93 feat(React): Avatar dropdown menu and logout function
- #2121 @hsheth2 feat(ingest): improve error reporting for pipelines
- #2117 @jjoyce0510 feat(GraphQL API): GQL implementation of Charts + Dashboards
- #2118 @...
DataHub v0.6.1
Added
- #2021 Add a CODEOWNERS file @jplaisted
- #1884 feat(dashboard): Dashboards backend implementation @keremsahin1
- #2001 feat(dataset): Enable search of datasets by field names @nagarjunakanamarlapudi
- #1986 feat: enable SCSI for datasets @jywadhwani
- #1936 feat(field-level-lineage): Add models for field level lineage @nagarjunakanamarlapudi
- #1842 feat(business-glossary):RFC for Business Glossary @pmsrao
- #1985 add LocalDAOStorageConfigFactory for SCSI @jywadhwani
- #1978 add SCSI bootstrap script for datasets @jywadhwani
Changed
- #2027 fix: ingestion docker image @jplaisted
- #2022 Fix dataset index creation issue @nagarjunakanamarlapudi
- #2008 feat(models): Add DataFlow and DataJob models @hshahoss
- #2009 fix/docs(frontend): Syncs UI with internal frontend @cptran777
- #2016 docs: upload updated deck @mars-lan
- #2015 docs: update links @mars-lan
- #2011 Townhall agenda for December 4 @nagarjunakanamarlapudi
- #2007 Bump GMA to latest @jplaisted
- #2005 feat(kubernetes): Add pod-level annotations to the datahub helm charts @shakti-garg-saxo
- #2004 1995 | fix indentation value in helm deployment templates @shakti-garg-saxo
- #1999 Update doc for configuring topic names @shakti-garg-saxo
- #1979 refactor(gms): use BaseLocalDAO as the interface in factories & rest.li resources @mars-lan
- #1932 feat(dashboard): Dashboard models update @keremsahin1
- #1991 fix: fix build definition of DatasetFieldUrn @jplaisted
- #1977 [Breaking] Update to GMA 0.2.0 and fix Urn definitions. @jplaisted
- #1989 2020-10-10 Syncronizing datahub-web {COMMIT-SYNC:7f757e3a514fdeff1de922112f182386bd322228} @igbopie
- #1981 1604086049622-ui-sync @igbopie
- #1988 Updates to town hall history and next town hall @nagarjunakanamarlapudi
- #1987 docs: update UI credential requirement for Quickstart @shakti-garg-saxo
- #1982 docs: update agenda of town hall @nagarjunakanamarlapudi
DataHub v0.6.0
Added
- #1940 add aspects to VALUE model of datasets @jywadhwani
- #1820 feat(Azkaban entities): RFC for Azkaban Flows and Jobs @hshahoss
- #1841 feat(field-level-lineage): RFC for field-level-lineage @nagarjunakanamarlapudi
Changed
- #1972 refactor search index builder to store urn parts efficiently @jywadhwani
- #1971 test: improve test coverage for DatasetIndexBuilder. @jplaisted
- #1969 feat: enable default restli documentation @mars-lan
- #1968 fix: add placeholder for logging call parameter @claudio-benfatto
- #1955 refactor: move code to linkedin/datahub-gma. @jplaisted
- #1931 Bump to datahub-gma 0.1.0 @keremsahin1
- #1962 Update faq.md @pardhugunnam
- #1960 Upgrade neo4j to 4.0 @keremsahin1
- #1958 fix: validate entity type for an urn @jywadhwani
- #1950 fix(login): Fix login error when corp user editable information is not present. Fixes #1948 @nagarjunakanamarlapudi
- #1949 Moves remaining references to non-inclusive language @cptran777
- #1947 Catch up fe to internal - includes module consolidations for faster build times @cptran777
- #1944 docs: update links @mars-lan
- #1933 feat(frontend): Catchup frontend for internal development changes @cptran777
- #1939 datasets client to extend browsable client @jywadhwani
- #1938 Change favicon and logo to be datahub instead of linkedin @cptran777
- #1937 refactor search index builder to store urn parts efficiently @jywadhwani
- #1913 Update tab.ts @andrewkantor
- #1935 Fixes issue where user avatar reaches internal page and improves aspects fetching from UI @cptran777
- #1934 docs: correct search over new field docs @shubhamg931
- #1929 build(docker): use community version of ES & Kibana in quickstart @mars-lan
Deleted
- #1973 get rid of search mock utils @jywadhwani
- #1964 refactor: drop unused models to prevent drifts @mars-lan
DataHub v0.5.0
Added
- #1775 feat(dashboard): Dashboard metadata models @ksahin
- #1818 doc(rfc): Add requirements / non requirements section to RFC. @jplaisted
- #1805 Start adding java ETL examples, starting with kafka etl. @jplaisted
- #1812 feat(ML models): RFC for ML models @jywadhwani
- #1721 feat: add ML models @arunvasudevan
- #1859 feat(platform): add "postgres" as a supported data platform @mars-lan
- #1844 feat(frontend): Module consolidation for some test modules and reduces errors from unsupported API calls @catran
- #1837 feat: add MCE ingestion support for CorpGroup @mars-lan
- #1821 feat(frontend): Module consolidation - clean up for OS logic - init virtual assistant @catran
Changed
- #1927 Announce DataHub's participation in Hacktoberfest @nagarjunakanamarlapudi
- #1924 Update next townhall meeting id @nagarjunakanamarlapudi
- #1916 refactor(gms): reorganize GMS factory namespace @mars-lan
- #1921 Update of townhall schedule for the next quarter @nagarjunakanamarlapudi
- #1918 fix(metadata-ingestion): Fix auditStamp unix timestamp format in sql etl ingestion @grantatspothero
- #1914 docker: Run as non-root user in docker @frsann
- #1912 doc: update search-over-new-field.md @ibona
- #1905 Adds UI support for custom dataset properties @catran
- #1909 docs: Update for topic name configuration @jplaisted
- #1904 frontend code migration and unused code removal font update and minor improvements @catran
- #1894 Add new spring factories to customize metadata event topic names. @jplaisted
- #1903 docs: update links @mars-lan
- #1901 docs: add Budapest talk @mars-lan
- #1900 build: fix build by adding zookeeper dependency explicitly @mars-lan
- #1898 Bump up kafkaAvroSerde to support SSL for Schema Registry @themightylaz
- #1899 fix(docker): update mae and mce consumer images to include glibc compat layer. allows the consumer jobs to deal with snappy compressed kafka topics when running on alpine linux @grantatspothero
- #1895 [BREAKING] Break dependency of ebean-dao on metadata-models. @jplaisted
- #1897 docs: update town hall history @mars-lan
- #1893 add default KAFKA_BOOTSTRAP_SERVER @liangjun-jiang
- #1871 feat: Port mce-cli to Java. @jplaisted
- #1889 fix (docker): Fix install of Chrome in frontend Dockerimage @frsann
- #1873 build: add failure notification on push @mars-lan
- #1881 Adds ability for midtier to serve custom dataset properties from aspect @catran
- #1880 Fixes current user entity not being populated correctly @catran
- #1874 fix (frontend): Partially fixes lineage issues and dataset API handling @catran
- #1872 build: fix build @mars-lan
- #1868 Small fixes to mce_cli @jplaisted
- #1863 fix(gms): update kafka client libraries to a newer version to support schema registry basic auth + SSL @grantatspothero
- #1857 1849 support ssl to mce cli.py @fabiofilz
- #1839 fix(ingestion): set schema registry URL correctly for FMCE producer @mars-lan
- #1838 build(node): replace broken & unmaintained gradle node plugin @mars-lan
- #1835 Pushing internal consolidation of modules to open source @catran
- #1828 docs: add external link @mars-lan
Removed
- #1925 remove CorpUsersClient file @jywadhwani
DataHub v0.5.0-beta
Changed
- #1806 Updated the frontend code. The frontend code was very far (> 6 months) behind the internal frontend code. We're not caught up yet, hence the BETA release, but we did go pretty far. Major refactorings were included.
Added
DataHub v0.4.3
Added
- #1782 improve security of k8s / helm charts
- #1791 Add description of dataset to the search index
- #1803 Add an example crawler for MS SQL
- #1811 Sync our internal backend code externally to HEAD (we're caught up now!)
- Added ESBulkWriterDAO to bulk write to ElasticSearch. Planned usage is for integration tests.
- Add Strongly Consistent Secondary Index (SCSI) Implementation for MySQL.
- Start adding code to generate aspect-entity specific metadata events, rather than our current single event approach.
- Add support in the GMS to ask for no aspects on entities by setting the aspectNames param to null (omitting the param is still considered as asking for all aspects). Useful if checking the existence of an entity to avoid a large response (i.e. performing a search to just get URNs back, and nothing else).
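The aspectNames note in the Added list above distinguishes three cases: the param omitted (all aspects), the param set to null (no aspects), and an explicit list. A minimal Python sketch of that distinction follows. It is not the GMS implementation; the function name `resolve_aspects`, the `_OMITTED` sentinel, and the aspect names in `ALL_ASPECTS` are hypothetical and exist only to illustrate the semantics.

```python
from typing import List

# Sentinel that lets us tell "param omitted" apart from "param explicitly null".
# (Hypothetical helper; not part of DataHub.)
_OMITTED = object()

# Hypothetical set of aspects an entity might carry.
ALL_ASPECTS: List[str] = ["ownership", "schemaMetadata", "status"]


def resolve_aspects(aspect_names=_OMITTED) -> List[str]:
    """Return the aspects to include in a response.

    - param omitted      -> all aspects
    - param set to None  -> no aspects (e.g. an existence check)
    - explicit list      -> only the known aspects that were asked for
    """
    if aspect_names is _OMITTED:
        return list(ALL_ASPECTS)
    if aspect_names is None:
        return []
    return [name for name in aspect_names if name in ALL_ASPECTS]


if __name__ == "__main__":
    print(resolve_aspects())               # omitted: every aspect
    print(resolve_aspects(None))           # null: nothing beyond the URN
    print(resolve_aspects(["ownership"]))  # explicit subset
```

A sentinel is used here because a plain `None` default could not tell an omitted param from an explicit null, and the note relies on exactly that difference.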
Changed
- #1777 Add docker files for development
Fixed
- #1808 Clear dataset description from search index when cleared in source
DataHub v0.4.2
Added
- #1711 feature(ingest): add bigquery ETL script @mars-lan
- #1712 feat(ingest): add PostgreSQL ETL script @mars-lan
- #1713 feat(ingest): replace custom hive-etl with sql-based ETL @mars-lan
- #1714 feat(ingest): add snowflake ETL script @mars-lan
- #1706 Implemented data process search feature @liangjun-jiang
- #1742 feat(gms): add postgres & mariadb supports to GMS @mars-lan
- #1752 build: build GitHub Pages from /docs directory @mars-lan
- #1745 feat(kafka-config): Add ability to configure other Kafka props @jsotelo
- #1754 Add documentation around the DataHub RFC process @jplaisted
Changed
- #1710 Refactor all ETL scripts to using Python 3 exclusively @mars-lan
- #1733 refactor(models): remove internal cluster model @hshahoss
- #1756 metadata-models 72.0.8 -> 80.0.0 @jywadhwani
- #1757 docs: add a sequence diagram and a description @liangjun-jiang
Removed
Fixed
- #1716 fix(py3): Bump ingestion Docker py dependency to 3.6 @keremsahin1
- #1726 fix: modify the etl script dependency @cobolbaby
- #1727 fix: correct the way to catch the exception @cobolbaby
- #1758 fix(ingestions): align the default kafka topics with PR @RealChrisL
DataHub v0.4.1
Added
- #1680 Data process entity @liangjun-jiang
- #1695 Implement data process graph feature
- #1708 feature(etl): add SQLAlchemy-based ingestion script @mars-lan
- #1707 Support for volta in web client @cptran777
- bbf7545 build: parallelize docker image builds @mars-lan
Changed
- #1700 Add missing updates from recent internal push @keremsahin1
- #1693 metadata-models 62.0.3 -> 72.0.8 @jywadhwani
- #1687 build(docker): refactor docker build scripts @mars-lan
- #1690 build(docker): refactor ingestion docker build script @mars-lan
- #1691 upgrade the version of neo4j @jywadhwani
- #1685 move the gradle plugin version to top level build.gradle @jywadhwani
- 63943a1 build: update workflows to build version-tagged docker images upon new release @mars-lan
Fixed
- #1697 fix: remove helm container command @jsotelo
- #1698 fix: add missing neo4j.host helm var @jsotelo
- #1709 [fix] load default picture link if not present @jywadhwani
- #1704 fix-DatasetSearchConfig class ref @geosmart
- f79b2c9 fix(ingestion): Fix sample MCE for data process @keremsahin1
- 867dbd0 fix: use tuple notations for union types @mars-lan
DataHub v0.4.0
Added
- #1568 Allow to store Quickstart dockers data in a folder for persistence @afranzi
- #1602 feat: support for Kubernetes-based deployment @bharatak
- #1608 add lineage hive @clojurians-org
- #1609 add support for kubernetes helm packaging @bharatak
- #1611 init jdbc generator @clojurians-org
- #1613 add oracle driver @clojurians-org
- #1629 feat: Converting MCE to a Spring boot Application @arunvasudevan
- #1635 feat: convert MAE application to springboot @arunvasudevan
- #1637 add postgresql support and force utf8 encode on non-utf8 locale @clojurians-org
- #1647 Add openldap-etl script and instruction @loftyet
- #1673 add DataProcess Urn @loftyet
- #1678 refactor(pdl): convert all pdsc to pdl @mars-lan
- #1677 feat(urn): add AzkabanFlow and AzkabanJob urn @hshahoss
Changed
- #1601 build: bypass testing datahub-web when running idea gradle task @mars-lan
- 6ab2ab6 build(mysql): Change mysql dependency from latest to 5.7 @keremsahin1
- #1610 metadata-models 54.0.1 -> 58.0.1 @jywadhwani
- #1616 metadata-models 58.0.1 -> 62.0.3 @jywadhwani
- #1619 refactor(gms): move gms restli resources @jywadhwani
- #1624 build(gms): rename JettyRunWar task to run @mars-lan
- #1626 refactor(frontend): fails loudly to help debug gms issue @mars-lan
- #1633 add field for ui and parser reference @clojurians-org
- #1641 migrate hive generator @clojurians-org
- #1662 style: add checkstyle and IDEA code style config @mars-lan
- #1664 build: update pegasus to v28 to add PDL support @mars-lan
- #1667 refactor: change the default log location @mars-lan
- #1669 refactor: use named volume instead of bind mount in quickstart @mars-lan
Deprecated
Removed
Fixed
- #1605 specify explicit avro lib for compatibility issue @jhsenjaliya
- d1cf628 Fix: Docker Quickstart - Sample Data Loading Error @RealChrisL
- ba33c7a Specify python version in mce-cli requirement.txt @RealChrisL
- #1621 fix: elasticsearch not starting on Mac @mars-lan
- #1622 build: pegasus plugin doesn't work well with gradle caching @mars-lan
- #1625 fix(gms): unable to find registered resources @mars-lan
- #1630 fix: Reduce gms & frontend docker image sizes @keremsahin1
- #1631 fix(Docker): Fixing 'dockerize not found' issue while starting @keremsahin1
- #1632 fix: Reduce mae-consumer & mce-consumer docker image sizes @bharatak
- #1646 fix(metadata-ingestion): pass schema_record to mce-cli cosumer @RealChrisL
- #1657 fix(quickstart): set utf8mb4 for mysql @e11it
- #1661 fix(urn): Move UrnCoercer into corresponding Urn class @mars-lan
- #1665 fix: use semantic instead of literal comparison in DefaultEqualityTester @mars-lan
- #1670 build: start enforcing checkstyle and fix all violations @mars-lan
- #1672 fix(frontend): Extract lastModified field from downstream/upstream aspect @keremsahin1