# Labs

> Shenandoah DATA-401 Intro to NLP Spring 2026


The course website is located [here](https://www.notion.so/Intro-to-Natural-Language-Processing-28b213a83886806982a5c03b425595c4?source=copy_link). Lecture materials, assignments, quizzes, etc. can be accessed at that link.

This site contains Jupyter notebooks, data, and other code artifacts associated with this course. I recommend running the notebooks in Google Colab, since they are tested in that environment. However, you are free to download them and run them elsewhere.

### Choosing a Notebook Environment

#### Google Colab - single notebook experience

If you:
- prefer working within a single notebook
- are already comfortable with Google Colab
- want instant access to GPUs

you may prefer Google Colab. Note that some libraries will need to be reinstalled after each restart.
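
Because libraries may be gone after a restart, it can help to check what is missing before re-running the install cell. The following is a minimal sketch using only the standard library; the function name `missing_packages` and the example module names are illustrative and are not part of the course package.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level modules from module_names that cannot be imported."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Example module names only; adjust to the libraries a notebook actually uses.
needed = ["spacy", "nltk"]
missing = missing_packages(needed)
if missing:
    print("Re-run the install cell; missing:", ", ".join(missing))
else:
    print("All listed modules are importable.")
```

Running this at the top of a notebook gives a quick signal about whether the install cell needs to be executed again.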

#### Deepnote
If you are comfortable working from within a forked copy of the repository itself and want:
- automatic installation of dependencies
- low friction when moving between notebooks
- real-time collaboration

you may prefer Deepnote. You will need to create a free account and then request an education plan.

#### Local JupyterLab / Notebook

If you are already comfortable with Jupyter in your local environment and:
- want full control of your machine and environment
- want persistence
- don't mind managing your environment yourself

you may prefer local Jupyter.

## Installation

### For Students (Google Colab)

For Google Colab (Python 3.10 compatible), use the special Colab requirements file:

```python
# Download and install Colab-specific requirements
!wget -q https://raw.githubusercontent.com/su-dataAI/data401-nlp/main/colab-requirements.txt
!pip install -q -r colab-requirements.txt

# The spaCy model will be automatically downloaded when needed
from data401_nlp.helpers.spacy import ensure_spacy_model
nlp = ensure_spacy_model("en_core_web_sm")
```

**Note:** The `colab-requirements.txt` file pins numpy to version 1.x for Python 3.10 compatibility with Colab's pre-installed IPython 7.34.0.
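
To confirm that the pin took effect after installation, a quick version check can be run in a notebook cell. This is a minimal sketch; the helper `major_version` is a hypothetical name introduced here for illustration and is not part of `data401_nlp`.

```python
import sys


def major_version(version: str) -> int:
    """Extract the major component from a version string such as '1.26.4'."""
    return int(version.split(".")[0])


print("Python:", sys.version.split()[0])

try:
    import numpy as np
except ImportError:
    print("NumPy is not installed; run the install cell first.")
else:
    # The Colab requirements file is expected to leave NumPy on a 1.x release.
    status = "1.x as expected" if major_version(np.__version__) == 1 else "not 1.x"
    print("NumPy:", np.__version__, f"({status})")
```

If the check reports a version other than 1.x, restarting the runtime and re-running the install cell is a reasonable first step.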

### For Deepnote

For Deepnote, install from the Deepnote requirements file:

```python
!pip install -r deepnote-requirements.txt
```


### For Local Development

If you want to run the notebooks locally:

```bash
# Clone the repository
git clone https://github.com/su-dataAI/data401-nlp.git
cd data401-nlp

# Install with all dependencies
pip install -e ".[dev,all]"
pip install -r requirements.txt

# Download spaCy model
python -m spacy download en_core_web_sm

# Start Jupyter Lab
jupyter lab
```

### Installation Options

The package supports flexible installation based on your needs:

```bash
# Minimal installation (core utilities only)
pip install data401-nlp

# Extras are quoted so that shells such as zsh do not interpret the brackets

# With NLP tools (spaCy, NLTK)
pip install "data401-nlp[nlp]"

# With transformers and PyTorch
pip install "data401-nlp[transformers]"

# With API support (FastAPI, Pydantic)
pip install "data401-nlp[api]"

# Everything (recommended for students)
pip install "data401-nlp[all]"
```

### Platform Support

✅ Google Colab  
✅ Deepnote  
✅ Jupyter Lab  
✅ Local Python 3.11+

### Helper Modules

The package includes several helper modules to make your NLP work easier:

- `data401_nlp.helpers.env` - Environment detection and API key loading
- `data401_nlp.helpers.spacy` - Automatic spaCy model management
- `data401_nlp.helpers.submit` - Assignment submission utilities
- `data401_nlp.helpers.llm` - LLM integration helpers

The helper libraries may be updated as the course proceeds.
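
As an illustration of what environment detection can look like, here is a minimal sketch. It is not the implementation in `data401_nlp.helpers.env`: the function `detect_environment` is hypothetical, and the check for the `DEEPNOTE_PROJECT_ID` environment variable is an assumption about how Deepnote sessions identify themselves.

```python
import os
import sys


def detect_environment(modules=None, environ=None) -> str:
    """Guess the notebook platform: 'colab', 'deepnote', or 'local'.

    modules and environ default to sys.modules and os.environ; they are
    parameters only so the function can be exercised without a real platform.
    """
    modules = sys.modules if modules is None else modules
    environ = os.environ if environ is None else environ

    # Colab kernels have the google.colab package loaded at startup.
    if "google.colab" in modules:
        return "colab"
    # Assumption: Deepnote sets this variable inside project sessions.
    if "DEEPNOTE_PROJECT_ID" in environ:
        return "deepnote"
    return "local"


print("Detected environment:", detect_environment())
```

A notebook could branch on the returned string, for example to pick which requirements file to install.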

## Contents

| Lab | Colab | Deepnote | GitHub | 
| ---- | ---- | ------ | ------ |
| Intro (Jan 15) | [![Open In Colab](https://img.shields.io/badge/Open%20in%20Colab-blue?logo=google-colab&style=flat-square)](https://colab.research.google.com/github/su-dataAI/data401-nlp/blob/main/nbs/01-intro.ipynb) | [![Open in Deepnote](https://img.shields.io/badge/Open%20in%20Deepnote-1f6feb?logo=deepnote&style=flat-square)](https://deepnote.com/launch?url=https://github.com/su-dataAI/data401-nlp/blob/main/nbs/01-intro.ipynb) | [![Open In GitHub](https://img.shields.io/badge/Open%20in%20GitHub-gray?logo=github&style=flat-square)](https://github.com/su-dataAI/data401-nlp/blob/main/nbs/01-intro.ipynb) |