Teko is looking for a highly motivated, hands-on Data Engineer. This position is responsible for designing and implementing cluster-computing data systems. You will collaborate closely with our engineers and must be able to work efficiently across teams.

Responsibilities:
- Collaborate with multiple teams and build Spark and storage clusters.
- Build automation to install, manage, and operate Spark, Redis, Cassandra, ELK, Prometheus, MinIO, and Kafka.
- Manage, back up, and monitor storage services and filesystems, such as S3, Cassandra and Redis tables, and object stores used for data analytics.
- Understand, debug, and patch big data files and objects in multiple formats (Parquet, HDF, Arrow).
- Build, maintain, and deploy data science frameworks.
- Automate ETL pipelines and ML workflows.
Requirements:
- BS in Computer Science or a similar field of study.
- Solid experience with big data tools, software, and libraries.
- Extensive software development experience (Python, Scala).
- Extensive experience with the Jupyter ecosystem.
- Experience running SQL in different flavours (Spark, Cassandra, RDBMS).
- Experience monitoring and tuning big data applications and services.
- Experience with distributed computing frameworks (Spark, Dask, TensorFlow).
- Experience with Concourse is a plus.
- Experience with Agile and Scrum.
Hiring test:
Download the 2014 New York City taxi trips dataset from https://www.kaggle.com/kentonnlp/2014-new-york-city-taxi-trips, build a data pipeline for analytics, and publish the results in ELK. Bonus: automate the pipeline with Concourse.
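As a rough illustration of what such a pipeline's transform stage might look like, here is a minimal plain-Python sketch that aggregates trips by pickup hour into documents ready for bulk indexing into Elasticsearch. The helper name `aggregate_trips_by_hour` and the `pickup_datetime`/`fare_amount` column names are assumptions about the dataset's schema, and the actual Elasticsearch indexing (and any Spark-based scaling) is out of scope here:

```python
# Hypothetical sketch of one transform step in the taxi-trip pipeline:
# group raw trip rows by pickup hour and compute per-hour statistics.
from collections import defaultdict
from datetime import datetime

def aggregate_trips_by_hour(trips):
    """Group trip records by pickup hour and compute count and average fare.

    `trips` is an iterable of dicts with (assumed) CSV columns
    `pickup_datetime` (e.g. "2014-01-09 20:45:25") and `fare_amount`.
    Returns a list of documents shaped for Elasticsearch bulk indexing.
    """
    totals = defaultdict(lambda: {"trip_count": 0, "fare_sum": 0.0})
    for trip in trips:
        hour = datetime.strptime(trip["pickup_datetime"], "%Y-%m-%d %H:%M:%S").hour
        totals[hour]["trip_count"] += 1
        totals[hour]["fare_sum"] += float(trip["fare_amount"])
    return [
        {"pickup_hour": h,
         "trip_count": v["trip_count"],
         "avg_fare": round(v["fare_sum"] / v["trip_count"], 2)}
        for h, v in sorted(totals.items())
    ]

# Tiny hand-made sample standing in for rows read from the Kaggle CSV.
sample = [
    {"pickup_datetime": "2014-01-09 20:45:25", "fare_amount": "6.5"},
    {"pickup_datetime": "2014-01-09 20:52:31", "fare_amount": "8.5"},
    {"pickup_datetime": "2014-01-10 07:12:02", "fare_amount": "12.0"},
]
docs = aggregate_trips_by_hour(sample)
# docs[0] -> {"pickup_hour": 7, "trip_count": 1, "avg_fare": 12.0}
```

In a full solution this logic would typically run as a Spark job over the whole dataset, with the resulting documents pushed to Elasticsearch via its bulk API and the job triggered from a Concourse pipeline.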
Provide the repo (GitHub, GitLab, etc.) of your solution to the hiring test, attach your latest CV, and send both to email@example.com with the hashtag #datateam in the subject line.