# Talk details are specified in YAML files
# YAML was selected because we can use multi-line strings and add
# comments in the file.
speaker_name: "Caelyn McAulay and Holden Karau"
talk_title: "A Basic Introduction to PySpark Dataframes by exploring ASF Gender Diversity Data"
# At least 1 tag is necessary!!
talk_tags:
- "tutorial"
- "pyspark"
- "data science"
- "2 hours"
talk_abstract: |
  Apache Spark is a fast and general engine for big data processing.
  Using PySpark, you can work with Spark DataFrames in Python.
  The target audience is familiar with Python and looking to get their feet
  wet with data science and/or the Spark framework. This tutorial will cover
  reading in data from files and basic DataFrame operations.
  While this session cannot provide enough background to support professional
  work with Spark, we aim to provide some interesting initial tools and pointers
  on how to go deeper for those interested.
# TODO: Add contents.
talk_details: |
  Please note that this tutorial is 2 hours long and may run into the lunch break.
  To sign up for this tutorial, [follow this link](https://github.com/pyconca/2018-wiki/wiki/Tutorials/).
# Markdown is supported
about_author: ''
# web link will only show if about_author section is present
author_website: ''