
ETL batch pipeline that gets files from SFTP into Cloud Storage, then ingests them into BigQuery.


mvoliveira1010/SFTPToBigQuery


SFTPToBigQuery

ELT batch pipeline that gets files from an SFTP server and uploads them to Cloud Storage, then reads them from Cloud Storage and ingests the data into BigQuery.

Step 1: fn-sftp-to-gcs.py

Description: This Python script is triggered by an HTTP request. It opens an SSH and SFTP connection, gets the file named by the file_name parameter, and writes it to a blob in a Cloud Storage bucket.
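The flow described above can be sketched roughly as follows. This is not the repository's code, only a minimal illustration assuming the `paramiko` and `google-cloud-storage` libraries. The function names and the environment variables `SFTP_HOST`, `SFTP_USER`, `SFTP_PASSWORD`, and `BUCKET_NAME` are hypothetical.

```python
import io
import os


def copy_sftp_file_to_bucket(sftp, bucket, file_name, remote_dir="."):
    """Download one file over SFTP and write it to a blob of the same name.

    `sftp` is expected to behave like paramiko.SFTPClient and `bucket` like
    google.cloud.storage.Bucket. Both are passed in so the copy logic does
    not depend on how the connections were created.
    """
    buffer = io.BytesIO()
    # getfo streams the remote file into a file-like object.
    sftp.getfo(f"{remote_dir}/{file_name}", buffer)
    buffer.seek(0)
    blob = bucket.blob(file_name)
    blob.upload_from_file(buffer)
    return blob.name


def handle_request(request):
    """HTTP entry point (hypothetical name); expects ?file_name=... ."""
    file_name = request.args.get("file_name")
    if not file_name:
        return ("Missing file_name parameter", 400)

    # Imported here so the helper above can be used without these packages.
    import paramiko
    from google.cloud import storage

    ssh = paramiko.SSHClient()
    # Assumption for the sketch only: this accepts unknown host keys.
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(
        hostname=os.environ["SFTP_HOST"],
        username=os.environ["SFTP_USER"],
        password=os.environ["SFTP_PASSWORD"],
    )
    try:
        sftp = ssh.open_sftp()
        bucket = storage.Client().bucket(os.environ["BUCKET_NAME"])
        copy_sftp_file_to_bucket(sftp, bucket, file_name)
    finally:
        ssh.close()
    return (f"Uploaded {file_name}", 200)
```

Buffering the file in memory keeps the sketch short; for large files, a temporary file on disk may be a better fit.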

Step 2: fn-gcs-to-bqs.py

Description: This Python script is triggered by the creation of a blob in Cloud Storage. It opens the blob named by the blob_name parameter as a dataframe, then ingests the dataframe into a BigQuery table.
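A minimal sketch of this step, assuming `pandas`, `google-cloud-storage`, and `google-cloud-bigquery` (with `pyarrow` installed for the dataframe load). It is not the repository's code. The table ID, the function names, and the assumption that the file is a CSV are all illustrative, and the blob name is read here from the `name` field of the storage event payload rather than from a `blob_name` parameter.

```python
import io

TABLE_ID = "my-project.my_dataset.my_table"  # hypothetical destination table


def blob_location(event):
    """Return (bucket_name, blob_name) from a Cloud Storage event payload."""
    return event["bucket"], event["name"]


def gcs_to_bigquery(event, context=None):
    """Storage-triggered entry point (hypothetical name)."""
    # Imported here so blob_location can be used without these packages.
    import pandas as pd
    from google.cloud import bigquery, storage

    bucket_name, blob_name = blob_location(event)
    raw = storage.Client().bucket(bucket_name).blob(blob_name).download_as_bytes()

    # Assumption: the uploaded file is a CSV with a header row.
    df = pd.read_csv(io.BytesIO(raw))

    job = bigquery.Client().load_table_from_dataframe(df, TABLE_ID)
    job.result()  # wait for the load job; raises if the load failed
    return job.output_rows
```

Waiting on `job.result()` means a failed load surfaces as an error in the function's logs instead of passing silently.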

Set up a data lifecycle rule on the Cloud Storage bucket to delete the blobs after x days.
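One way to set such a rule from Python is sketched below, assuming the `google-cloud-storage` client. The function name is hypothetical, and the number of days is left as a parameter because the retention period is not specified here.

```python
def set_delete_after_days(bucket, days):
    """Add a lifecycle rule that deletes objects older than `days` days.

    `bucket` is expected to behave like google.cloud.storage.Bucket.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    # Appends a Delete action with an age condition to the bucket's rules.
    bucket.add_lifecycle_delete_rule(age=days)
    # Sends the changed bucket configuration to Cloud Storage.
    bucket.patch()


# Usage (bucket name and retention_days are placeholders):
#   from google.cloud import storage
#   bucket = storage.Client().get_bucket("my-bucket")
#   set_delete_after_days(bucket, retention_days)
```

Loading the bucket with `get_bucket` first fetches its existing lifecycle rules, so the new rule is added to them rather than replacing them.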
