Skip to content

Commit

Permalink
Redirect ELT to blog (meltano#6681)
Browse files Browse the repository at this point in the history
* Redirect ELT to blog

- Following Taylor's instructions from meltano#6679 to make the ELT page in docs redirect to our blog post, instead of deleting it.
- Why we're doing this (SEO updates): https://github.com/meltano/internal-marketing/issues/494

* Update docs/src/_concepts/elt.md

Co-authored-by: Taylor A. Murphy <taylor@meltano.com>
  • Loading branch information
lauraperezco and tayloramurphy committed Aug 29, 2022
1 parent 9f16f32 commit 4115c4a
Showing 1 changed file with 3 additions and 74 deletions.
77 changes: 3 additions & 74 deletions docs/src/_concepts/elt.md
Original file line number Diff line number Diff line change
@@ -1,78 +1,7 @@
---
title: "ELT vs ETL"
description: "ELT Meaning, Advantages & More: All You Need To Know To Get Started"
layout: doc
weight: 11
redirect_to: https://meltano.com/blog/etl-vs-elt-which-is-better/
hidden: true
---

Just like ETL, ELT is a data integration process found in most data management processes.

When building data infrastructure and selecting a tool stack, data engineers and even software developers often find themselves at a crossroads: what data integration process would work best? And what tools are best for the job?

On one hand, there is the familiar and reliable ETL process that has been a staple of most data management systems since the 1990s. On the other, there is the more modern ELT process that has several advantages over ETL, yet it doesn’t have quite the name recognition.

**Here, we’ll settle the dispute between ELT and ETL once and for all by:**

* Defining the ELT meaning and explaining the way it works
* Outlining the key differences between ELT and ETL
* Exploring the typical use cases of the ELT approach

### [Meltano ELT Today](/reference/command-line-interface#elt)

## What Is ELT?
ELT means Extract, Load, Transform. It is a method of data replication and transformation used to perform data integration at any scale.
The purpose of ELT is to extract specific data, such as customer information or billing records, from its source (a CRM system or an SQL database, for instance), and deliver it to its end point in the fastest, most reliable way possible.
As you can imagine, the name ELT means that this process consists of three steps:
* Extracting the raw, unorganized data from the source location
* Loading the unprocessed data into its target destination, such as a data warehouse
* Transforming the data into an accessible and comprehensible format at its target destination, such as an analytics report

## ETL vs. ELT: What’s The Difference?
ELT is often used interchangeably with ETL. However, the two approaches have a few notable differences which directly influence their benefits and use cases.
* The first notable and key difference between the two methods is apparent in their names. ELT stands for Extract, Load, Transform, which is a distinct process from ETL which stands for Extract, Transform, Load. To get a better idea of the distinction between ETL and ELT means, let’s take a closer look at the logic behind each approach.
* The first step of each method is the same. Step one requires extracting the raw source data from its original environment, such as a CRM system or any data storage. Once the data has been extracted, the integration process begins — and that’s where the differences between the two approaches start.
* The mechanics behind ELT mean that the data is loaded directly into the destination data warehouse or data lake, where it is transformed before being exported in the end-user format. The key detail is that transformation happens after the replication steps are complete. This is the key benefit we will touch on later.
* In ETL, however, the transformation step happens earlier in a separate staging environment. After the ETL interprets, cleans up and organizes the data in this environment, it loads the ready-made information into the destination storage where end users can access it. The ETL process requires the transformation of data to happen while it is in midflight.

The ultimate goal of both the ETL and ELT process is to transform raw, unprocessed data into valuable and comprehensible insights, such as analytics reports. Both methods achieve this but the latter is preferable for the following reasons.

## 4 Key Advantages ELT Has Over ETL
While the two processes have their similarities, ELT has a few clear advantages over ETL, thanks to the inherent efficiency, reliability and scalability built into the flow. Let's look at four key points that make ELT the superior process between the two.
#### Advantage #1: Less Destructive
You don't want that to be a one-time or destructive process that can't be repeated or that can't be iterated upon.
#### Advantage #2: Repeatable
The transformation is where you want to apply your business logic you're cleansing. And the important thing to think about is that transforms your business logic and your cleansing is going to improve constantly.
#### Advantage #3: Fewer Errors and Duplicate Work
When you do the transformation in the middle, then you're in, you're imposing your business logic on the data. While it's in flight, as your business laws changes, you have no way to rerun what you ran at that point in time. You're also going to have a lot more failures.
#### Advantage #4: Collaborative and Cost Effective
What is helpful about ELT is that you can have a kind of virtuous downstream cycle. The example being, I extract and load transform and publish. But my teammate also needs to extract from me and they can have the same process and so on and so on and so on. So you can have a core team doing ELT. They can also have subscribing teams which can have subscribing teams and so on and so on.
The benefit of this ETL process here is that they have an official record of whatever we published on that day. You are able to restate everything to your downstream consumer who will have some record of what I published on that day. And that's one of the reasons why we have a separate EL step vs including the transformation in with the replication.

## When To Use The ELT Process
Because of ELT’s unique features and advantages, it is superior to ETL as a data processing protocol in a variety of use scenarios and projects. But when would ETL benefit you the most?

Here are a few scenarios when ELT would work best for you:
* **You run a small to medium-sized enterprise:** Let’s start with the most obvious scenario that has to do with one of ELT’s key advantages: cost. The truth is that ETL, despite all its advantages, can be too costly of a solution for organizations that don’t have enough resources to operate a maintenance-heavy system. By comparison, ELT is a much more affordable solution that can be more suitable for small or medium-sized organizations that don’t have the resources or need for a costly solution.
* **You need to process a large data set, quickly:** Due to its inherent scalable cloud computing abilities, ELT is more capable at loading and processing large data sets. By comparison, ETL is best suited for smaller data chunks and can be more sensitive to the data quality and formatting due to the limitation of the staging transformation environment.
* **You need to keep all your old data at hand:** Processing your data using ELT means that once it reaches your data warehouse or storage and gets processed, it will stay there indefinitely — as long as the storage capacity allows it. This can be a crucial factor for organizations that prefer to keep all their data in one place, including its unprocessed raw form. With ELT, you will always be able to access your old data in case you need to use it at any point in the future.

In contrast to many ETL solutions, ELT is cloud-based and does not require as much human input for updates and maintenance.

## Is Meltano An ELT Tool?
Meltano is an open-source ELT platform powered by:
* Singer’s existing library of over 250 community-maintained data extractors and loaders
* dbt’s transformation protocols
* Airflow’s workflow orchestration framework

Aimed at data consultants, engineers and developers of data products, Meltano is a self-hosted solution that can be integrated into any local machine or production environment and managed via a built-in command line interface (CLI).
True to our open-source roots, Meltano is supported by a global network of 1,900+ data specialists at startups and Fortune 500 companies alike. Our goal is to become the leading DataOps Ops and enable our users to build next-generation data stacks.

## Key ELT Facts To Remember
At the heart of every data management infrastructure lies a process that transforms the data from an unorganized collection of entries into comprehensible reports and insights.
This process usually goes by one of the following two names — ETL which means Extract, Transform, Load or ELT which means Extract, Load, Transform. And while the two names and the mechanics behind them are similar, the two processes couldn’t be more different.
While ETL is the older and better-known method of the two, it’s not without its drawbacks. It doesn't rely on the power of cloud computing as much as ELT, takes more effort and resources to maintain and can be very difficult to scale.
By comparison, ELT is more streamlined and follows the key principle of a data pipeline — to process and organize your data as quickly, accurately and seamlessly as possible.

In other words, ELT means cost-effectiveness, flexibility and limitless collaboration.

Get Started with [Meltano ELT Today](/reference/command-line-interface#elt)
For in-depth information about ELT, ETL and which one is better for each use case, please visit our ['ETL vs ELT' blog](https://meltano.com/blog/etl-vs-elt-which-is-better/).

0 comments on commit 4115c4a

Please sign in to comment.