Welcome to the wiki for the course Social graphs and interactions (02805) offered by the Technical University of Denmark. This is the main page, where you can access the weekly exercises. If you take a look in the side-bar, you can read about the administrative details (including a very useful course overview), assignments, books, and more.
The class is taught flipped-classroom style, where the lecture and homework elements of a course are reversed. You'll be able to view short video lectures before (or during) the class session, so in-class time can be devoted to exercises, projects, or discussions. Check out the Before week 1 lecture to learn more.
Before week 1. Take a look at this page before you do anything. This class most likely works a little bit differently from other classes you've taken. The notebook explains pretty much everything - the rest will be explained during the lectures. In case the link doesn't work, you can also see the file here on github, but the videos won't display properly.
Week 1: Introduction. This week is all about getting started: installing Python, learning about Jupyter notebooks, and making sure that you're a relatively skilled Python programmer. We also have some fun with the Twitter API. In case the link below doesn't work, you can also see the file here on github, but the videos won't display properly.
- Reading: Read through chapter 1 of Mining the Social Web, 2nd Edition (MTSW2e) by Matthew A. Russell. We will focus on content up to page 29, so you can skim the rest. The first chapter of the book (which is all we'll use this semester) is available for free. You can get it here.
Week 2: Networks I. It's time to learn about networks. I have to admit that I love networks. I could talk about them for hours. And that's actually also what I'll be doing for today's lecture. Lots of info from me + some reading for you guys. I'll answer some important questions, such as "Why would anyone care about networks?" and "How can you use Python to study networks?". In case the link above doesn't work, you can also see the file here on github.
- Reading: Chapters 1, 2, and 3 (sections 3.1-3.7 ... the most important part is 3.1-3.4, so focus on that) of Network Science. You can find the entire book online here.
Week 2.1: Course page and Peer evaluations. I know you love working on the problems, growing, learning. But in order to get good grades, you also need to know how the class works. So check out this short notebook to get on top of the course homepage and our peer evaluation policy. In case the link above doesn't work, you can also see the file here on github.
Week 3: Networks II. This week, we go deeper with networks. Now that you're on top of the basics, we'll look at the research for this week's topic. We'll focus on two key research results from the turn of the millennium that truly kick-started a revolution in our understanding of networks. Specifically, we will look at problems with random networks as models for real networks and how that leads to the Watts-Strogatz model. Then we will discuss scale-free networks and the Barabasi-Albert model. As always, in case the link above doesn't work, you can also see the file here on github.
- Reading: We will continue with Network Science. We start with the rest of chapter 3 with an emphasis on 3.8-3.9. Then we read Chapter 4, section 4.1 - 4.7 and Chapter 5, section 5.1 - 5.5.
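To give a feel for the Barabasi-Albert model before you read about it, here is a minimal sketch of preferential attachment using only the Python standard library. It is an illustration, not the course's own code: the function name `barabasi_albert_edges` and the parameters `n` (final number of nodes), `m` (links added per new node), and `seed` are assumptions made for this example. The Watts-Strogatz model is not covered here.

```python
import random
from collections import Counter

def barabasi_albert_edges(n, m, seed=None):
    """Grow a graph to n nodes by preferential attachment.

    Each new node links to m distinct existing nodes, chosen with
    probability proportional to their current degree.
    (Hypothetical helper written for this example.)
    """
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    # Every node appears in this list once per edge endpoint, so a uniform
    # draw from it picks a node with probability proportional to its degree.
    endpoints = []
    targets = list(range(m))  # the first new node links to the m starting nodes
    for new_node in range(m, n):
        for target in targets:
            edges.append((new_node, target))
            endpoints.extend((new_node, target))
        # Choose m distinct, degree-weighted targets for the next node.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = sorted(chosen)
    return edges

edges = barabasi_albert_edges(n=1000, m=2, seed=42)
degree = Counter(node for edge in edges for node in edge)
print("number of edges:", len(edges))
print("largest degree:", max(degree.values()))
print("median degree:", sorted(degree.values())[len(degree) // 2])
```

Comparing the largest degree with the median degree is a quick way to see the hubs that preferential attachment tends to produce. In practice you would probably reach for NetworkX's `barabasi_albert_graph` rather than rolling your own.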
Week 4: Networks III, Revenge of the Data Scientist. Today you will be getting your very own dataset from Wikipedia. Working with real data is a pain in the a**, and today you will experience this fact firsthand. But you should, of course, be thanking me for providing this experience, since this is what the real world feels like. After your life at DTU no one will be giving you a nice, cleaned dataset that you can easily load into your favorite data structure. So I hope this experience will be valuable for you as you move through life after DTU. And I promise that you will never fear raw data again. As always, in case the link above doesn't work, you can also see the file here on github.
- Reading: There's no reading today. You'll have enough to do with running regular expressions and other fun things.
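As a taste of the regular-expression work, here is a minimal sketch that pulls link targets out of wiki markup. The sample text and the helper name `extract_wiki_links` are made up for illustration. The pattern assumes the common `[[Target]]`, `[[Target|label]]`, and `[[Target#Section|label]]` link forms and makes no attempt to handle templates or other edge cases you will meet in real pages.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]];
# only the target (the part before any '#' or '|') is captured.
WIKI_LINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")

def extract_wiki_links(wikitext):
    """Return the link targets found in a string of wiki markup.

    (Hypothetical helper written for this example.)
    """
    return [target.strip() for target in WIKI_LINK.findall(wikitext)]

sample = (
    "[[Plato]] taught [[Aristotle|his most famous student]] "
    "in [[Athens#History|Athens]]."
)
links = extract_wiki_links(sample)
print(links)
```

Each extracted target can then become an edge from the page you are parsing to the page it links to.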
Week 5: Networks IV, Advanced measures and communities. You've done a lot of work retrieving the philosopher network from Wikipedia. Today the goal is to analyze it and learn something about both network science and philosophy along the way. As always, in case the link above doesn't work, you can also see the file here on github.
- Reading: This week, the reading is mostly for reference. It's for you to have a place to go, if you want more detailed information about the topics that I cover in the video lectures. Thus, I recommend you check out Chapter 9 of the network science book. In particular, we'll delve into section 9.4 in the exercises. We will also talk a little bit about degree correlations - you can read about those in Chapter 7.
Week 6: NLTK I, Getting started with NLTK. Ok. So we're changing gears. We've looked at the networks in Wikipedia. Now we'll put together the tools for working with the text. This first week is going to be a walk in the park (I still feel bad for making you download all those philosopher pages) - so we'll just be installing the software, reading the book a bit, and solving some exercises. Easy-peasy. (Plus, if you're behind, today's light load makes it a nice day to catch up on everything else.) As always, in case the link above doesn't work, you can also see the file here on github.
- Reading: Natural Language Processing with Python, first edition (NLPP1e) Chapter 1, Sections 1.1, 1.2, 1.3. (It's free online) and NLPP1e Chapter 2.1 - 2.4.
Week 7: NLTK II, A mixed bag of useful tricks. Now, let's get real and work with some language/text. I'm taking you through the dreaded chapter 3 of NLPP1e, talking about TF-IDF as a way to summarize what is important about a document, and we'll be getting into sentiment.
- Reading I: NLPP1e Chapter 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.9, and 3.10. It's not important that you go in depth with everything here - the key thing is that you know that Chapter 3 of this book exists, and that it's a great place to return to if you're ever in need of an explanation of regular expressions, Unicode, etc.
- Reading II: Check out the Wikipedia page for TF-IDF.
- Reading III: Temporal Patterns of Happiness and Information in a Global Social Network: Hedonometrics and Twitter
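To make the TF-IDF idea from this week concrete, here is a minimal sketch using only the standard library. It assumes one of several common weightings (term frequency as count divided by document length, inverse document frequency as the natural log of the number of documents over the number of documents containing the term) and naive whitespace tokenization. The function name `tf_idf` and the toy documents are invented for this example.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: score} dict per document.

    Assumes tf = count / document length and
    idf = log(N / number of documents containing the term).
    (Hypothetical helper written for this example.)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the philosopher wrote about the mind",
]
scores = tf_idf(docs)
# Print the terms of the first document, highest score first.
for term, score in sorted(scores[0].items(), key=lambda item: -item[1]):
    print(f"{term:>5}  {score:.3f}")
```

A word that occurs in every document (here "the") gets a score of zero, while words unique to one document rise to the top, which is why TF-IDF works as a rough summary of what a document is about.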
Week 8: NLTK III, Networks and text, all together. This week is about putting network science and text analysis together to understand how humans navigate Wikipedia to find information. There is no reading, just data analysis!