# PANDAS

This is a series of IPython notebooks for analyzing Big Data, specifically Twitter data, using Python's powerful PANDAS (Python Data Analysis) library.

For these tutorials I am assuming you have already downloaded some data and are now ready to begin examining it. In the first notebook I will show you how to set up your IPython working environment and import the Twitter data we have downloaded. If you are new to Python, you may wish to work through, in order, a series of tutorials I have created.

If you want to skip the data download and just use the sample data, but don't yet have Python set up on your computer, you may wish to go through the tutorial "Setting up Your Computer to Use My Python Code".

Also note that we are using the IPython Notebook interactive computing framework to run the code in this tutorial. If you're unfamiliar with it, see the tutorial "Four Ways to Run your Code".

For a more general set of PANDAS notebook tutorials, I'd recommend this cookbook by Julia Evans. I also have a growing list of "recipes" that contains frequently used PANDAS commands.

## Prerequisites

As you may know from my other tutorials, I am a big fan of the free Anaconda version of Python 2.7. It contains all of the prerequisites you need and will save you a lot of headaches getting your system set up. Once it's all installed, open up a terminal and run the following:

    git clone https://github.com/gdsaxton/PANDAS.git
    cd PANDAS
    ipython notebook

A tab will open in your browser at http://localhost:8888 containing links to all of the available chapters.
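If the notebook server starts but a chapter fails on its first import, a quick check of the environment can narrow the problem down. The following is a minimal sketch, not part of the repository: the `missing_packages` helper and the package list in `REQUIRED` are assumptions made for illustration, and the script uses `importlib.util.find_spec`, which requires Python 3.4 or later rather than the Python 2.7 mentioned above.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list of packages the notebooks rely on; adjust it to match
# the imports at the top of the chapter you are running.
REQUIRED = ["pandas", "numpy", "IPython"]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

Anaconda normally ships all three packages, so a non-empty result usually means the terminal is picking up a different Python installation than the one Anaconda installed.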

Sample data for use in Chapter 1 can be found in the data folder.
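To give a sense of the import step the first notebook covers, here is a small sketch of reading tweet data into a PANDAS DataFrame. It is illustrative only: the inline CSV text, the column names (`screen_name`, `created_at`, `retweet_count`), and the account names are hypothetical stand-ins, not the actual contents of the data folder, and the code assumes a current Python 3 installation with pandas available.

```python
import io

import pandas as pd

# Hypothetical stand-in for a sample data file; the real file name,
# format, and columns in the data folder may differ.
SAMPLE_CSV = """screen_name,created_at,retweet_count
org_a,2014-01-05 10:00:00,3
org_b,2014-01-06 12:30:00,0
org_a,2014-01-07 09:15:00,7
"""

# Read the CSV text into a DataFrame, parsing the timestamp column as dates.
df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["created_at"])

# Inspect the number of rows and columns, then the type of each column.
print(df.shape)
print(df.dtypes)

# Count the number of tweets sent by each account.
tweets_per_account = df.groupby("screen_name").size()
print(tweets_per_account)
```

With a real file, the `io.StringIO(SAMPLE_CSV)` argument would be replaced by the path to that file inside the data folder.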

I hope you find these tutorials helpful. If you use them in your own research, please acknowledge the source:

Saxton, Gregory D. (2015). Analyzing Big Data with Python. Buffalo, NY: http://social-metrics.org

Also, please share and spread the word to help build a vibrant community of PANDAS users.

Happy coding!
