# Lab 1

**Goal**: Get a Jupyter notebook running; build, visualize, and measure some basic networks using XGI and NetworkX; and compute some basic statistics!

**Before we get started**:
1. Make a BIOL4559homework repository on GitHub. Add Abhay Gupta (@AbhayGupta115) as a collaborator in the Settings tab on GitHub.
2. In GitHub Desktop, make a branch titled `comp-activity-1`.
3. Put this notebook in the repository.
4. Create a conda environment titled `biol4559` by running `conda create -n biol4559 python=3.13`
5. Activate your conda environment by running `conda activate biol4559`
6. Download the requirements file from Canvas (in the `labs` folder under `Files`) and place it in the repository.
7. Install the list of requirements by running `pip install -r requirements.txt`
8. Make your environment visible to Jupyter Notebook by running `python -m ipykernel install --user --name biol4559 --display-name "Python (BIOL4559)"` in the command line.

**After the completion of this lab**
1. List the people with whom you worked on this lab.
2. Make a pull request and title it "Grade Computational Activity 1".
3. Assign @AbhayGupta115 as a reviewer.
4. Submit the pull request.

**How you will be graded**

This activity is out of 10 points. You will be graded on three things: completion, clear documentation, and correctness of your process.

* *Completion (4pts)*: This is solely whether you attempted and completed all portions of the computational assignment.
* *Documentation (4pts)*: Write comments above your code describing what it does, to demonstrate that you understand what the code is doing. Examples of good and bad documentation are below.
* *Correctness (2pts)*: Does your code run and produce the correct output?

**Documentation examples**

Good example:
```python
# import necessary packages
import networkx as nx

# create an empty network
G = nx.Graph()

# add nodes 1, 2, and 3 to the network from a list.
G.add_nodes_from([1, 2, 3])

# add edges to the network from a list of the edges
G.add_edges_from([[1, 2], [2, 3]])

# Visualize the network
nx.draw(G)
```

Bad example:
```python
# Build and draw a network
import networkx as nx
G = nx.Graph()
G.add_nodes_from([1, 2, 3])
G.add_edges_from([[1, 2], [2, 3]])
nx.draw(G)
```

## NetworkX

Let's start by learning how to work with NetworkX! NetworkX is probably the most popular package for working with networks.

We import networkx:

We start by creating an empty network (or *graph*):

Let's suppose that we want to add nodes and edges to this network. `G` is what is known as an *object*, and we use the `.` operator to access methods that work on graph objects. (I like to think of *objects* as nouns, and *methods* or *functions* as verbs.)

We can use the `add_node` and `add_edge` methods to add nodes and edges to this network:

We can also add multiple nodes and edges at a time!
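The steps above can be sketched in a single cell. This is a minimal sketch rather than the notebook's original cells, and the node labels (1 through 4) are arbitrary choices for illustration:

```python
# import the NetworkX package under its usual alias
import networkx as nx

# create an empty (undirected) network
G = nx.Graph()

# add a single node labeled 1
G.add_node(1)

# add a single edge between nodes 1 and 2
# (an endpoint that is not in the network yet is added automatically)
G.add_edge(1, 2)

# add several nodes at once from a list
G.add_nodes_from([3, 4])

# add several edges at once from a list of pairs
G.add_edges_from([(2, 3), (3, 4)])

# report how many nodes and edges the network has
print(G.number_of_nodes(), G.number_of_edges())
```

Notice the noun/verb pattern: `G` is the object, and `add_node`, `add_edge`, `add_nodes_from`, and `add_edges_from` are the methods acting on it.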

Let's visualize this network:
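As a rough sketch of what drawing looks like, the cell below rebuilds a small path network (the labels are arbitrary) and draws it. It assumes `matplotlib` is installed, since NetworkX relies on it for drawing; the fixed `seed` is only there so the layout looks the same each time you run it.

```python
# import NetworkX
import networkx as nx

# build a small network directly from a list of edges
G = nx.Graph()
G.add_edges_from([(1, 2), (2, 3), (3, 4)])

# compute a position for every node; the seed makes the layout reproducible
pos = nx.spring_layout(G, seed=1)

# draw the network at those positions and label each node
nx.draw(G, pos=pos, with_labels=True)
```

In a Jupyter notebook, the figure should appear directly below the cell.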

We can look at the NetworkX [documentation](https://networkx.org/documentation/stable/) to find other methods.

But before we get started, let's talk about `list` and `dict`.

What is a list? It is an ordered collection of items, often numbers (for example, `l = [1, 2, 3]`). You can (1) access entries by their *index*, which starts at 0 (for example, `l[0]` returns `1`), and (2) add entries to the end of a list with the `append` method (for example, after running `l.append(4)`, we get `l = [1, 2, 3, 4]`).
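Here is a short sketch of those two list operations that you can run yourself (the variable name `first` is just for illustration):

```python
# create a list with three entries
l = [1, 2, 3]

# access the first entry; indexing starts at 0
first = l[0]
print(first)

# add a new entry to the end of the list
l.append(4)
print(l)
```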

What is a dict? It is a collection of *key*, *value* pairs (for example, `d = {"node1": 1, "node2": 3, "node3": 7}`). You can (1) access entries by their *key* (for example, `d["node1"]` returns `1`), and (2) add entries to the dictionary with `=` (for example, after running `d["node4"] = 2`, we get `d = {"node1": 1, "node2": 3, "node3": 7, "node4": 2}`).
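The same idea for a dictionary, as a small sketch (the variable name `value` is just for illustration):

```python
# create a dictionary that maps node names (keys) to numbers (values)
d = {"node1": 1, "node2": 3, "node3": 7}

# look up the value stored under the key "node1"
value = d["node1"]
print(value)

# add a new key-value pair to the dictionary
d["node4"] = 2
print(d)
```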

Lastly, what if you want to iterate over a list of things? The `for` loop is your friend! We can iterate over entries in a list:
```python
l = [1, 3, 4]
for i in l:
    print(i)
```
(which will print `1`, `3`, and `4`, one per line) or we can iterate over a pre-specified range:
```python
for i in range(4):
    print(i)
```
(which will print `0`, `1`, `2`, and `3`, one per line).

**Graded activity**

Search the documentation and implement the following:

* Load the Zachary Karate Club network. How many nodes and edges does it have?
* Check whether the network is connected or not.
* Get the degree of each node. Who has the highest degree (and what is their degree)? Who has the lowest degree (and what is their degree)?
* Get the mean degree.
* Visualize the network. (*Ungraded:* If you want to be fancy, play around with node layouts!)

**Student short answer here**

## XGI

Okay, that's NetworkX. What about XGI? XGI stands for Comple**X** **G**roup **I**nteractions and handles not only networks (or graphs) but also *hypergraphs*, which are composed of group interactions. We won't get into the details quite yet, but for now, all you need to know is this: you can also build, represent, and visualize networks with XGI.

We start by importing XGI:

XGI has much of the same syntax for adding nodes and edges. See below:

And adding multiple nodes/edges...

Now let's draw what we have!

XGI can color/size nodes by their statistics. Here I'm accessing the nodes of the network and getting their degree:

Everything that you need is on the XGI [website](https://xgi.readthedocs.io) for tutorials, documentation, and examples.

**Graded activity**

* Load the Zachary's Karate Club dataset in NetworkX and convert it to an `xgi.Hypergraph`. Hint: you'll need to use `G.edges` and convert it to a list.
* Compute the clustering coefficient of each node. Who has the highest clustering coefficient (and what is it)? Who has the lowest clustering coefficient (and what is it)?
* What is the density of the network?
* Visualize the network with the nodes sized by their clustering coefficient.

**Student short response here**

Fun (**ungraded**) challenge problem:

Go to the [Budapest Reference Connectome 3.0](https://pitgroup.org/connectome/) and download the edge list. Import the data as a pandas (or polars) dataframe and use the first two columns as an edge list to import into NetworkX. Visualize this network. What are some features of this network that you notice?