This is a collection of blog posts and talks given at various meetups and conferences. All questions are welcome in the Slack channel.
Blog Posts
2022
Fugue Core
Integrations
- Scaling PyCaret with Spark (or Dask) through Fugue (Towards Data Science)
- Fugue and DuckDB: Fast SQL Code in Python (Towards Data Science by Khuyen Tran)
2021
Fugue Core
- Fugue - Reducing Spark Developer Friction (James Le Blog)
- Creating Pandas and Spark Compatible Functions with Fugue (Towards Data Science)
Data Validation
- Using Pandera on Spark for Data Validation through Fugue (Towards Data Science)
FugueSQL
- Interoperable Python and SQL in Jupyter Notebooks (Towards Data Science)
- Data Analysis with FugueSQL on Coiled Dask Clusters (Coiled Blog)
- Introducing FugueSQL — SQL for Pandas, Spark, and Dask DataFrames (Towards Data Science by Khuyen Tran)
Talks
2022
2021
Data Validation
- Large Scale Data Validation with Spark and Dask (PyCon US)
- Fully Utilizing Spark for Data Validation (Spark AI Summit)
- Large Scale Data Validation with Fugue (PyData Global)
FugueSQL
- Dask SQL Query Engines (Dask Summit)
- FugueSQL: Extending SQL Interface for End-to-End Data Pipelines (Dremio Subsurface)
- FugueSQL - The Enhanced SQL Interface for Pandas, Spark, and Dask DataFrames (PyData Global)
- Distributed Computing Workflows with Fugue-sql (Orlando Python Meetup)
Machine Learning
- Superworkflow of Graph Neural Networks with K8S and Fugue (Spark AI Summit)
- Scaling Machine Learning Workflows to Big Data with Fugue (KubeCon)
- Distributed ML to Learn Causal Effect Using Fugue and Spark (AI Camp)
Tune
- Intuitive and Scalable Hyperparameter Tuning with Apache Spark + Fugue (Spark AI Summit)
- Fugue Tune (PyData Global)
Testing Spark
- Simplifying Testing of Spark Applications (PyData Global)
- Simplifying Testing of Spark Applications (DataOps DC Meetup)
2020
- Unifying Spark and Non-Spark Ecosystems for Big Data Analytics (Spark AI Summit)