
Ville Puuska

Experience

  • 2023- Data Engineer, Solita
  • 2017-2023 PhD student/researcher, Tampere University

Data engineering

Interests

  • Streaming data pipelines
  • Event-driven architectures and data pipelines

Tech at work

  • Python, PySpark, Spark SQL, R when absolutely necessary
  • Azure Data Factory, Databricks

Tech at home

  • Python and a bit of Go
  • Airflow, Docker, DuckDB, Kafka, Polars, Postgres

What I'm trying/planning to learn

  • API development / FastAPI
  • K8s
  • Streaming and stream processing / Kafka & Flink

Mathematics

Research and Publications

My research is focused on the algebraic theory of topological data analysis. I'm interested in utilizing (minimal) resolutions to develop computable and interpretable representations and invariants for multiparameter persistent (co)homology and persistence modules more generally.

Education

  • 2017-2023, PhD, Mathematics, Tampere University
    Advisor: Professor Eero Hyry, Tampere University
    Field: Topological Data Analysis
    Thesis: Flat Covers and Cotorsion in Persistence https://urn.fi/URN:ISBN:978-952-03-3058-3
  • 2013-2017, MSc (and BSc), Mathematics, University of Tampere

Pinned

  1. Journeys-pipeline-dlt-DuckDB-Polars

    Simple example of an ELT pipeline using dlt for ingesting from the JourneysAPI, DuckDB for intermediate storage, and DuckDB & Polars for transformations.

    Python 1

  2. Streaming-and-processing-CPU-and-RAM-usage

    Python

  3. tkl-delays-app

    Python

  4. Message-queue

    Go

  5. DuckDB-examples

    Basic tutorial and example scenario for using DuckDB

    Jupyter Notebook

  6. AoC

    Advent of Code solutions

    Jupyter Notebook