Data Engineer with Python

This repository contains the code for DataCamp's Data Engineer with Python career track.

About the course

In this track, you’ll discover how to build an effective data architecture, streamline data processing, and maintain large-scale data systems. In addition to working with Python, you’ll grow your language skills as you work with Shell, SQL, and Scala to create data engineering pipelines, automate common file system tasks, and build a high-performance database.

Through hands-on exercises, you’ll add cloud and big data tools such as AWS Boto, PySpark, Spark SQL, and MongoDB to your data engineering toolkit to help you create and query databases, wrangle data, and configure schedules to run your pipelines. By the end of this track, you’ll have mastered the critical database, scripting, and process skills you need to advance your career.

The repository contains the code for the courses in this track as Jupyter Notebooks:

1. Data Engineering for Everyone

2. Python Programming

Please note this track assumes a fundamental knowledge of Python and SQL.

The code samples have been tested using Python 3.8.11.

Installation

If you have an existing Python 3.8.x installation on a Unix-like system, you can install the required Python libraries using pip:

sudo -H python3 -m pip install <Library>

If you are running Python 3.8.x on Windows, use:

py -3 -m pip install <Library>

Alternatively, you can install the Anaconda Distribution for your particular platform.
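After installing, it can help to confirm that the interpreter version and the libraries are in place before opening the notebooks. The following is a minimal sketch, not part of the repository. This README does not enumerate the required libraries, so the module names in CANDIDATE_MODULES are assumptions derived from the tools mentioned above (boto3 is assumed to be the import name for AWS Boto), and the helper names python_matches and missing_modules are hypothetical.

```python
import importlib.util
import sys

# Assumed import names; the README does not list the required libraries.
CANDIDATE_MODULES = ["pyspark", "boto3", "pymongo"]


def python_matches(required=(3, 8), version_info=None):
    """Return True if the major.minor version equals `required`."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) == tuple(required)


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    if not python_matches():
        print("Note: the samples were tested on Python 3.8.11; "
              "other versions may behave differently.")
    for name in missing_modules(CANDIDATE_MODULES):
        print(f"Missing: {name}")
```

Any name reported as missing can be installed with the pip command shown above for your platform. Adjust the list to match the imports in the notebooks you intend to run.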

Acknowledgments

DataCamp
