Data Discovery and Lineage for Big Data Ecosystem
Latest commit 04fa089 Oct 13, 2018

WhereHows

WhereHows is a data discovery and lineage tool built at LinkedIn. It integrates with all the major data processing systems and collects both catalog and operational metadata from them.

Within the central metadata repository, WhereHows curates, associates, and surfaces the metadata information through two interfaces:

  • a web application that enables data and lineage discovery, and community collaboration
  • an API endpoint that empowers automation of data processes/applications

WhereHows serves as the single platform that:

  • links data objects with people and processes
  • enables crowdsourcing for data knowledge
  • provides data governance and provenance based on ownership and lineage

Who Uses WhereHows?

Here is a list of companies known to use WhereHows. Let us know if we missed your company!

How Is WhereHows Used?

See how WhereHows is used inside LinkedIn, along with other potential use cases.

Documentation

Detailed documentation is available in the Wiki.

Examples in VM (Deprecated)

A pre-built VMware image (about 11 GB) is available to quickly demonstrate the functionality of WhereHows. Check out the VM Guide.

WhereHows Docker

Docker provides a quick, configuration-free dev/production setup. Check out the Docker Getting Started Guide.

Getting Started

New to WhereHows? Check out the Getting Started Guide.

Preparation

First, set up the metadata repository in MySQL.

CREATE DATABASE wherehows
  DEFAULT CHARACTER SET utf8
  DEFAULT COLLATE utf8_general_ci;

CREATE USER 'wherehows';
SET PASSWORD FOR 'wherehows' = PASSWORD('wherehows');
GRANT ALL ON wherehows.* TO 'wherehows';
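
Note that the SET PASSWORD ... = PASSWORD(...) form used above was removed in MySQL 8.0. On newer servers, the user can instead be created with its password in one statement. The following is a minimal sketch, not part of the WhereHows repository: the function name wherehows_user_sql is hypothetical, and piping into a local root login is an assumption about your setup.

```shell
#!/bin/sh
# Hypothetical helper: prints user-setup SQL that avoids the PASSWORD()
# function, which is not available in MySQL 8.0.
wherehows_user_sql() {
  cat <<'SQL'
CREATE USER 'wherehows' IDENTIFIED BY 'wherehows';
GRANT ALL ON wherehows.* TO 'wherehows';
SQL
}

# Assumed usage (requires a running MySQL server and a root account):
#   wherehows_user_sql | mysql -u root -p
```

The function only prints the statements, so they can be reviewed before being sent to the server.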

Execute the DDL files to create the required repository tables in the wherehows database.

Build

  1. Get the source code: git clone https://github.com/linkedin/WhereHows.git
  2. Place a few third-party jar files in the wherehows-etl/extralibs directory. Some of these jar files may not be available in Maven Central or Artifactory. See the download instructions for more detail. cd WhereHows/wherehows-etl/extralibs
  3. From the WhereHows root directory, build all the modules: ./gradlew build
  4. Start the metadata ETL and API service: ./gradlew wherehows-backend:runPlayBinary
  5. In a new terminal, start the web front-end: ./gradlew wherehows-frontend:runPlayBinary. The WhereHows UI is available at http://localhost:9001 by default. You can change the port number by editing the value of project.ext.httpPort in wherehows-frontend/build.gradle.
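
The clone and build steps above can be collected into one script. This is a rough sketch under stated assumptions rather than a script shipped with WhereHows: the run helper, the build_wherehows function, and the DRY_RUN variable are all hypothetical names. By default it only prints the commands. The jar download in step 2 stays manual, and the two runPlayBinary commands are left out because each one blocks its terminal.

```shell
#!/bin/sh
# Hypothetical wrapper around steps 1 and 3 of the build instructions.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to execute for real.

run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

build_wherehows() {
  run git clone https://github.com/linkedin/WhereHows.git
  # Step 2 is manual: copy the third-party jars into
  # WhereHows/wherehows-etl/extralibs before building.
  run cd WhereHows
  run ./gradlew build
}

build_wherehows
```

Once the build succeeds, start the backend and the front-end in separate terminals as described in steps 4 and 5.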

Roadmap

Check out the current roadmap for WhereHows.

Contribute

Want to contribute? Check out the Contributors Guide

Community

Want help? Check out the Gitter chat room and Google Groups