uniphil/sfpc-py101

Hello! 👻

Today we're going to talk a bit about text scraping, manipulation, and analysis in Python.

Workshop by Phil, Riley, and Yeli.

Tools

If you don't have a favorite text editor already, download Sublime Text. You can use Xcode for these exercises if you're used to it, but we recommend Sublime since it's simpler and less clunky.

Next we need pip, Python's package installer. Open up a terminal and run this command to download its installer script:

$ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py

Then, to install it, run

$ sudo python get-pip.py

Once we've got pip set up, we can install Beautiful Soup with

$ sudo pip install beautifulsoup4

Beautiful Soup helps us scrape text from the internet. Muahahaha! 👹
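
To give a feel for what that looks like, here is a minimal sketch that pulls the paragraph text out of a small HTML snippet. The snippet and the `extract_paragraphs` helper are made up for illustration; on a real page you would first fetch the HTML and then hand it to Beautiful Soup in the same way.

```python
from bs4 import BeautifulSoup

# A made-up page, kept inline so the example needs no network access.
html = """
<html><body>
<h1>Ghost stories</h1>
<p class="story">Boo! said the <b>ghost</b>.</p>
<p class="story">Nobody answered.</p>
</body></html>
"""


def extract_paragraphs(markup):
    """Return the plain text of every <p> element in the markup."""
    # "html.parser" is the parser bundled with Python, so nothing extra to install.
    soup = BeautifulSoup(markup, "html.parser")
    # get_text() drops the tags (like <b>) and keeps only the words.
    return [p.get_text() for p in soup.find_all("p")]


paragraphs = extract_paragraphs(html)
for paragraph in paragraphs:
    print(paragraph)
```

`find_all("p")` collects every paragraph tag, and `get_text()` flattens nested tags such as `<b>` into plain text.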

With pip in place, we can also install NLTK with

$ sudo pip install nltk

NLTK is a suite of text processing libraries for Python that lets us analyze text in some really interesting and powerful ways. For the intro exercises, we'll work through part of the NLTK Book. It's a great resource, so check it out!!
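
As a small taste of that kind of analysis, the sketch below counts word frequencies in a sentence. It is only an illustration, not part of the NLTK Book exercises: the sample text and the `word_frequencies` helper are invented here, and it sticks to `wordpunct_tokenize` and `FreqDist`, which should work without any downloaded corpora.

```python
from nltk import FreqDist
from nltk.tokenize import wordpunct_tokenize

# Invented sample text, just for this example.
text = "The ghost said boo. The cat said nothing. The ghost left."


def word_frequencies(raw):
    """Count how often each word appears, ignoring case and punctuation."""
    # wordpunct_tokenize splits on a regular expression, so it needs no extra data.
    tokens = [t.lower() for t in wordpunct_tokenize(raw) if t.isalpha()]
    return FreqDist(tokens)


freq = word_frequencies(text)
print(freq.most_common(3))
```

`FreqDist` behaves much like a dictionary of counts, so `freq["ghost"]` looks up how many times a single word appeared.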

NLTK comes loaded with a bunch of corpora and trained models. We're going to use some of them, so in your Python REPL type:

import nltk
nltk.download()

If it looks like nothing happened, check whether a new window popped open in the background. In that window, open the "Collections" tab and download the collection named "book".
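
If the downloader window never shows up (for example on a machine without a graphical display), the same collection can probably be fetched by passing its name to `nltk.download` directly. This is a hedged sketch: `download_book_collection` is a hypothetical helper name, and the call needs an internet connection.

```python
import nltk


def download_book_collection():
    """Fetch the "book" collection without opening the downloader window.

    Returns True if NLTK reports that the download succeeded.
    """
    # Passing an identifier skips the interactive window and downloads directly.
    return nltk.download("book")
```

Call `download_book_collection()` once from the REPL; after that the corpora stay on disk and do not need to be downloaded again.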

Cool links
