For those looking to automate repetitive data engineering tasks with programming.
Recommended reading: Teach Yourself Programming in Ten Years by Peter Norvig.
- Data transform (in progress: Hesam, Farzaneh & Mehdi)
  - Pivot - Farzaneh
  - Binning - Hesam & Mehdi
- Data visualization and reporting - ?
  - Make Excel, PDF attachment
  - Add table to body of email
  - Make plot as part of email HTML content
  - Make HTML content (rich content)
  - Visualization
    - Plotly or Bokeh
- SQL (to be finished)
  - Read
  - Write
  - SQL Bulk insert with Python
  - Make abstract function with parameters to handle these
- FTP (in progress: Mehdi)
  - Read
  - Write
- Log file generation
- What else?
  - Multithread sample
  - Subprocess sample
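The last three items above (log file generation, a multithread sample, a subprocess sample) can be sketched with the standard library alone. This is a minimal illustration, not code from this repository: the function names, the logger name and the placeholder task (squaring a number) are all assumptions made for the example.

```python
import logging
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def make_logger(log_path):
    """Return a logger that appends timestamped lines to log_path."""
    logger = logging.getLogger("data_eng_sample")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    # Drop handlers from earlier calls so lines are not written twice.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(threadName)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


def process_item(item, logger):
    """Placeholder task (an assumption): square a number and log it."""
    result = item * item
    logger.info("processed %s -> %s", item, result)
    return result


def run_threaded(items, logger, max_workers=4):
    """Run process_item on a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda item: process_item(item, logger), items))


def run_subprocess(logger):
    """Run a child Python process and return what it printed."""
    completed = subprocess.run(
        [sys.executable, "-c", "print('hello from child')"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    logger.info("child exited with code %s", completed.returncode)
    return completed.stdout.strip()
```

Threads suit I/O-bound work such as file transfers or database calls; for CPU-bound transforms a process pool is usually the better fit.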
|Run|Year|Code address|
In this repository, we aim to provide samples for different tasks, such as those listed in the table below.
| | Action 1 | Action 2 |
|---|---|---|
| FTP | Read | Write |
| SFTP | Read | Write |
| SQL | | |
| SQL Bulk Insert | | |
| SQL Bulk Insert with Logging | | |
| | Read | Write |
| Pandas | Pipeline for read & clean | |
| Sample data cleaning | | |
| Pandas | Pipeline for transform | |
| Pandas | Pipeline for write | |
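import pysftp

# Open an SFTP session; the connection is closed when the block exits.
with pysftp.Connection('hostname', username='me', password='secret') as sftp:
    with sftp.cd('/allcode'):           # temporarily chdir to /allcode on the remote
        sftp.put('/pycode/filename')    # upload local /pycode/filename into remote /allcode
        sftp.get('remote_file')         # download remote_file into the local working directory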
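For the "SQL Bulk Insert with Logging" task listed in the table, a minimal sketch could look like the following. The README does not name a database, so this example assumes the standard-library `sqlite3` module as a stand-in; `bulk_insert`, the logger name and the `sales` table are hypothetical names chosen for illustration.

```python
import logging
import sqlite3

logger = logging.getLogger("bulk_insert_sample")  # hypothetical logger name


def bulk_insert(conn, table, columns, rows):
    """Insert many rows with one executemany call and log the row count.

    table and columns are interpolated into the SQL text, so they must come
    from trusted code, never from user input. Values use ? placeholders.
    """
    placeholders = ", ".join("?" for _ in columns)
    column_list = ", ".join(columns)
    sql = f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"
    rows = list(rows)
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(sql, rows)
    except sqlite3.Error:
        logger.exception("bulk insert into %s failed", table)
        raise
    logger.info("inserted %d rows into %s", len(rows), table)
    return len(rows)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    bulk_insert(conn, "sales", ["region", "amount"],
                [("north", 10.5), ("south", 7.0)])
    print(conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0])
```

Passing the table name, column names and rows as parameters keeps one function reusable across tables. Other database drivers use different placeholder styles (for example `%s`), so the SQL string would need adjusting there.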
| | Action 1 | Action 2 |
|---|---|---|
| File | Make | Delete |
| Folder | Make | Delete |
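The file and folder actions in this table can be sketched with `pathlib` and `shutil` from the standard library. The helper names below are assumptions made for the example, not functions defined in this repository.

```python
import shutil
from pathlib import Path


def make_folder(path):
    """Create a folder, including missing parents; no error if it exists."""
    folder = Path(path)
    folder.mkdir(parents=True, exist_ok=True)
    return folder


def make_file(path, text=""):
    """Create (or overwrite) a text file, creating its parent folder first."""
    file_path = Path(path)
    make_folder(file_path.parent)
    file_path.write_text(text, encoding="utf-8")
    return file_path


def delete_file(path):
    """Delete a file; do nothing if it is already gone."""
    Path(path).unlink(missing_ok=True)  # missing_ok requires Python 3.8+


def delete_folder(path):
    """Delete a folder and everything inside it; do nothing if missing."""
    folder = Path(path)
    if folder.is_dir():
        shutil.rmtree(folder)
```

`delete_folder` removes the whole tree without confirmation, so it is worth checking the path before calling it on anything outside a scratch directory.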