Apache Sqoop is a command-line tool designed for efficiently transferring bulk data between Apache Hadoop and structured data stores such as relational databases. Note: Apache Sqoop has been retired to the Apache Attic as of 2021. Users are encouraged to migrate to Apache Spark or Apache NiFi.
URL: https://sqoop.apache.org/
- Big Data, Data Transfer, ETL, Hadoop, RDBMS, Retired
- Created: 2026-03-16
- Modified: 2026-04-19
Apache Sqoop provides a command-line interface for bulk data transfer between Hadoop and relational databases; its core commands include sqoop-import, sqoop-export, sqoop-job, and sqoop-eval.
Human URL: https://sqoop.apache.org/docs/1.4.7/SqoopUserGuide.html
- CLI, Data Transfer, ETL, Hadoop, RDBMS
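The commands listed above share a common JDBC-driven invocation pattern. A minimal sketch of a table import and a pre-import check, using Sqoop 1.4-era flags; the host, database, table, user, and password-file paths are placeholders, not values from this listing:

```shell
# Import the `orders` table into HDFS using 4 parallel map tasks.
# dbhost, sales, orders, etl_user, and the password file are all
# illustrative -- substitute your own connection details.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user \
  --password-file /user/etl/.db-password \
  --table orders \
  --target-dir /data/sales/orders \
  --num-mappers 4

# Evaluate a quick SQL statement against the source before importing.
sqoop eval \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user \
  --password-file /user/etl/.db-password \
  --query "SELECT COUNT(*) FROM orders"
```

These examples require a working Hadoop cluster, a Sqoop installation, and the matching JDBC driver on Sqoop's classpath, so they are sketches rather than copy-paste recipes.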
| Feature | Description |
|---|---|
| Bulk Import | High-throughput parallel import from RDBMS to HDFS, Hive, or HBase. |
| Bulk Export | Export data from HDFS back to relational database tables. |
| Incremental Loads | Delta-based incremental loading using append or lastmodified strategies. |
| Direct Import Mode | Native database utility-based transfers for MySQL and PostgreSQL. |
| Hive Integration | Auto-create Hive tables and load imported data directly into Hive. |
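The incremental-load and export features in the table above can be sketched as follows; every connection detail and column name is illustrative, and the flags shown are the standard Sqoop 1.4 options:

```shell
# Incremental append: only rows whose check column exceeds the
# recorded last-value are pulled on each run (placeholders throughout).
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user \
  --password-file /user/etl/.db-password \
  --table orders \
  --target-dir /data/sales/orders \
  --incremental append \
  --check-column id \
  --last-value 40000

# Export: push delimited HDFS files back into an RDBMS table.
# The target table must already exist with a matching schema.
sqoop export \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user \
  --password-file /user/etl/.db-password \
  --table order_summaries \
  --export-dir /data/sales/summaries \
  --input-fields-terminated-by ','
```

For recurring incremental loads, wrapping the import in a saved job (`sqoop job --create`) lets Sqoop record the last-value between runs instead of it being passed by hand.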
| Use Case | Description |
|---|---|
| Data Warehouse Loading | Load relational database data into Hadoop-based data warehouses. |
| Database Offloading | Move historical data from RDBMS to HDFS for cost-effective storage. |
| Integration | Description |
|---|---|
| Apache Hadoop | Primary target storage for Sqoop imports via HDFS. |
| Apache Hive | Create and populate Hive tables from RDBMS imports. |
| MySQL | MySQL JDBC and direct mysqldump-based connector. |
| Oracle | Oracle JDBC connector for enterprise database data transfer. |
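The Hive and MySQL integrations above combine with the import flags already shown; a hedged sketch, again with placeholder connection details and a hypothetical `analytics` Hive database:

```shell
# Hive integration: import a table and auto-create a matching Hive table.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user \
  --password-file /user/etl/.db-password \
  --table orders \
  --hive-import \
  --create-hive-table \
  --hive-table analytics.orders

# Direct mode: bypass JDBC and stream data via mysqldump for faster
# MySQL transfers (plain-text output only).
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user \
  --password-file /user/etl/.db-password \
  --table orders \
  --direct \
  --target-dir /data/sales/orders_direct
```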
FN: Kin Lane
Email: info@apievangelist.com