Module 1: ICP #6

Sneha Mishra edited this page Jun 30, 2018 · 1 revision

Team: 12
Professor: Yugyung Lee

Name: Sneha Mishra
Class ID: 11
Email: smccr@mail.umkc.edu
MyGitHub

Technical Partner:
Name: Aditya Soman
Class ID: 19
Email: aditya.soman@mail.umkc.edu
GitHub

Objective

Introduction to Sqoop.

Features

  1. Install Sqoop.
  2. Use Sqoop to import and export MySQL tables to and from HDFS.
  3. Create Hive tables through an HQL script, then use Sqoop to import and export those tables to and from relational databases.
  4. Run three queries against the databases.

Steps:

Step 1: Install Sqoop (or Cloudera)

Step 2: Part 1

Import and export MySQL tables to and from HDFS
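
The import and export in this part can be sketched as below. This is a minimal illustration, not the exact commands used in the lab: the JDBC URL, user name, table names, and HDFS paths are all hypothetical placeholders. The Python helpers only build the `sqoop import` and `sqoop export` argument lists and do not run Sqoop themselves.

```python
import shlex

# Hypothetical connection details (assumptions, not from the lab write-up).
JDBC_URL = "jdbc:mysql://localhost/icp6"
USERNAME = "root"


def sqoop_import_cmd(table, target_dir, mappers=1):
    """Build a `sqoop import` argument list: MySQL table -> HDFS directory."""
    return [
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--username", USERNAME,
        "-P",                        # prompt for the password on the console
        "--table", table,
        "--target-dir", target_dir,  # HDFS output directory
        "-m", str(mappers),          # number of parallel map tasks
    ]


def sqoop_export_cmd(table, export_dir):
    """Build a `sqoop export` argument list: HDFS directory -> MySQL table.

    The target MySQL table must already exist before the export runs.
    """
    return [
        "sqoop", "export",
        "--connect", JDBC_URL,
        "--username", USERNAME,
        "-P",
        "--table", table,
        "--export-dir", export_dir,  # HDFS directory holding the data files
    ]


if __name__ == "__main__":
    # Print shell-ready command lines (table and path names are hypothetical).
    print(shlex.join(sqoop_import_cmd("employees", "/user/cloudera/employees")))
    print(shlex.join(sqoop_export_cmd("employees_copy", "/user/cloudera/employees")))
```

The printed lines can be pasted into a terminal on a machine where Sqoop and the MySQL JDBC driver are installed.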

Step 2: Part 2

Import and export Hive tables to and from relational databases
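
A rough sketch of this part follows, under stated assumptions: the `students` table, its columns, the script file name, and the connection details are hypothetical, and the warehouse path shown in the usage is only the common Hive default. The helpers build the `hive -f`, `sqoop import --hive-import`, and `sqoop export` argument lists without executing them.

```python
import shlex

# Hypothetical HQL script that creates a comma-delimited Hive table.
HQL_SCRIPT = """\
CREATE TABLE IF NOT EXISTS students (
  id INT,
  name STRING,
  score DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
"""


def hive_script_cmd(script_path):
    """Run an HQL script file with the Hive CLI."""
    return ["hive", "-f", script_path]


def sqoop_hive_import_cmd(jdbc_url, username, table, hive_table):
    """Import a relational table straight into a Hive table."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",                      # prompt for the password
        "--table", table,
        "--hive-import",           # load the imported data into Hive
        "--hive-table", hive_table,
        "-m", "1",
    ]


def sqoop_export_hive_table_cmd(jdbc_url, username, table, warehouse_dir):
    """Export a Hive table's files back to an existing relational table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "-P",
        "--table", table,
        "--export-dir", warehouse_dir,
        # Must match the delimiter declared in the HQL script above.
        "--input-fields-terminated-by", ",",
    ]


if __name__ == "__main__":
    url = "jdbc:mysql://localhost/icp6"  # hypothetical database
    print(shlex.join(hive_script_cmd("create_students.hql")))
    print(shlex.join(sqoop_hive_import_cmd(url, "root", "students", "students")))
    print(shlex.join(sqoop_export_hive_table_cmd(
        url, "root", "students_export", "/user/hive/warehouse/students")))
```

The field delimiter passed to the export has to agree with the one used when the Hive table was created; otherwise Sqoop cannot split the rows into columns.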

Step 2: Part 3

Run three queries against the databases
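
The write-up does not record which three queries were run, so the ones below are purely illustrative. They assume a hypothetical Hive table `students(id, name, score)` and show how each query could be passed to the Hive CLI with `hive -e`.

```python
import shlex

# Three example queries against a hypothetical `students` table.
QUERIES = [
    "SELECT COUNT(*) FROM students",                       # row count
    "SELECT name, score FROM students WHERE score >= 90",  # filter
    "SELECT AVG(score) FROM students",                     # aggregate
]


def hive_query_cmd(query):
    """Build a Hive CLI argument list that executes a single query string."""
    return ["hive", "-e", query]


if __name__ == "__main__":
    for q in QUERIES:
        # shlex.join quotes the query so it is safe to paste into a shell.
        print(shlex.join(hive_query_cmd(q)))
```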

References:

  1. https://dzone.com/articles/sqoop-import-data-from-mysql-to-hive
  2. https://stackoverflow.com/questions/22404641/using-sqoop-to-import-data-from-mysql-to-hive
  3. https://stackoverflow.com/questions/23472688/data-import-from-mysql-with-apache-sqoop-error-no-manager-for-connect-string
  4. https://stackoverflow.com/questions/26515700/mysql-jdbc-driver-5-1-33-time-zone-issue
  5. http://community.cloudera.com/t5/Hadoop-101-Training-Quickstart/I-run-a-Hadoop-job-but-it-got-stucked-and-nothing-is/td-p/47856/page/2
  6. https://stackoverflow.com/questions/27282535/sqoop-cannot-find-mysqldump-when-using-direct-import-into-hdfs