Instructor: Michael L. Nelson mln@cs.odu.edu
Office Hours: Mondays 2-4 and by appointment
Time: Mondays 4:20pm - 7:00pm
Place: online -- contact instructor for Zoom URL
Class Email list: https://groups.google.com/group/cs895-f22
After a review of the protocols and mechanics of web archiving and social media, this class will focus on using web archives to establish the veracity of information that we encounter online. The class will stress student participation and presentation. Students will summarize the research work of others (chosen from a reading list) and propose their own forensics studies in areas of their own interest. Resources produced in the prior offering of this course include:
- Michael L. Nelson, Russell Westbrook, Shane Keisel, Fake Twitter Accounts, and Web Archives, 2019.
- Nauman Siddique, TweetedAt: Finding Tweet Timestamps for Pre and Post Snowflake Tweet IDs, 2019.
- Sawood Alam, Cookie Violations Cause Archived Twitter Pages to Simultaneously Replay in Multiple Languages, 2019.
Other forensics examples:
- The Conservative Party Speeches and Why We Need Multiple Web Archives
- Links to abovethelaw.com broken on the live web and blocked from the archive.
- Dominic Cummings claiming he warned about a coronavirus in 2019
- The interaction between search engine caches and web archives.
- GOP candidate Marjorie Taylor Greene spread conspiracies about Charlottesville and 'Pizzagate'.
- Right-Wing Media Outlets Duped by a Middle East Propaganda Campaign.
- The Internet Archive Is Being Used As A Disinformation Mule.
- Blake Masters' campaign scrubbed the abortion section of his policy page
- What exactly did #KathyGriffin say to have her account @kathygriffin suspended?
- A good summary of web archives and their use in disinformation studies: Using Web Archives in Disinformation Research
- Week 1 - August 29 - The W3C Web Architecture, Memento Protocol, and Research Issues With Web Archiving
  - Background: Memento 101, UTC, ISO 8601, robots.txt, The Missing Semester of Your CS Education
- Week 2 - September 5 - Labor Day -- no class
- Week 3 - September 12 - Continued: The W3C Web Architecture, Memento Protocol, and Research Issues With Web Archiving
- Week 4 - September 19 - Blockchain Can Not Be Used To Verify Replayed Archived Web Pages, Russell Westbrook, Shane Keisel, Fake Twitter Accounts, and Web Archives
- Week 5 - September 26 - Student Presentations 1
  - David Calano - Jawa: Web Archival in the Era of JavaScript
  - Tarannum Zaki - Understanding Web Archiving Services and Their (Mis)Use on Social Media
- Week 6 - October 3 - Student Presentations 2
  - Yasith Jayawardana - Fake News vs Satire: A Dataset and Analysis
  - Lesley Frew - Visualizing changes to US federal environmental agency websites
- Week 7 - October 10 - Fall Holiday -- no class
- Week 8 - October 17 - Student Forensics Studies 1
  - Bhanuka Mahanma - What makes fake images detectable? Understanding properties that generalize (Moved from Week 6 because of the storm)
  - Skanda Dhanushkanda - Melting Pot of Origins: Compromising the Intermediary Web Services that Rehost Websites (Moved from Week 6 because of the storm)
  - Lesley Frew - J.K. Rowling Transgender-Comments Controversy
- Week 9 - October 24 - Student Presentations 3 / Student Forensics Studies 1 continued
  - Emily Escamilla - Replaying Archived Twitter: When your bird is broken, will it bring you down?
  - Yasith Jayawardana - Where Did the Web Archive Go?
  - Bhanuka Mahanma - Sri Lankan Economic Crisis and the ex-Governor of Central Bank
- Week 10 - October 31 - Student Forensics Studies 2
  - Tarannum Zaki - Rep. Marjorie Taylor Greene's controversial 4th of July fabricated tweet
  - Emily Escamilla - The Tweets of Jerry Falwell Jr.
  - David Calano - Retracing the Siege of Hong Kong Polytechnic University
- Week 11 - November 7 - Student Presentations 4
  - Lesley Frew - The Internet Archive and the socio-technical construction of historical facts
  - David Calano - Reproducible Web Corpora: Interactive Archiving with Automatic Quality Assessment
  - Yasith Jayawardana - Tweets of Andrew Tate
- Week 12 - November 14 - Student Forensics Studies 3
- Week 13 - November 21 - Student Presentations 5
  - Emily Escamilla - Only One Out of Five Archived Web Pages Existed as Presented
  - Bhanuka Mahanma - Diffusion of Lexical Change in Social Media
  - Tarannum Zaki - On Twitter Purge: A Retrospective Analysis of Suspended Users
- Week 14 - November 28 - Student Forensics Studies 4
  - Skanda Dhanushkanda - How well are Sri Lankan sites archived?
  - David Calano - Web Archival Survey of Global Data Portals
  - Lesley Frew - Extending the EDGI US federal environmental agency websites study
- Week 15 - December 5 - Student Forensics Studies 5
  - Yasith Jayawardana - Nov 15, 2022 Missile Strike on Poland
  - Skanda Dhanushkanda - Labeled tweets of Donald Trump
  - Bhanuka Mahanma - Sri Lankan Inflation using Web Archives
  - Tarannum Zaki - The Controversy with Dictionary Keywords on Social Media
- Week 16 - December 12 - Exam Week
Reading list:
- Election Integrity Partnership Team, Repeat Offenders: Voting Misinformation on Twitter in the 2020 United States Election, 2020.
- Melanie Smith, Interpreting Social Qs: Implications of the Evolution of QAnon, 2020.
- Mary Huber, Chinese Citizens Find Ways to Circumvent COVID-19 Censorship, 2020; Amelia Acker, Platforms, Community Archives and Remembering the Pandemic, 2020.
- Donie O'Sullivan, How We Proved That the Biggest Black Lives Matter Page on Facebook Was Fake, 2020.
- Kate Starbird, Carly Miller, Examining Twitter’s policy against election-related misinformation in action, 2020.
- L Chai, D Bau, SN Lim, P Isola. What makes fake images detectable? Understanding properties that generalize, European Conference on Computer Vision, 2020.
- Andy Greenberg, Hackers Broke Into Real News Sites to Plant Fake Stories, 2020.
- Russell Brandom, Researchers uncover six-year Russian misinformation campaign across Facebook and Reddit, 2020.
- Joan Donovan, Covid hoaxes are using a loophole to stay alive—even after content is deleted, 2020.
- Joan Donovan, Protest misinformation is riding on the success of pandemic hoaxes, 2020.
- Joan Donovan, Brian Friedberg, Source Hacking: Media Manipulation in Practice, 2019.
- Renee DiResta, Isabella García-Camargo, Virality Project (US): Marketing meets Misinformation, 2020.
- Caroline Orr, Pro-Trump & Russian-Linked Twitter Accounts Are Posing As Ex-Democrats In New Astroturfed Movement, 2018.
- Takuya Watanabe, Eitaro Shioji, Mitsuaki Akiyama, and Tatsuya Mori, Melting Pot of Origins: Compromising the Intermediary Web Services that Rehost Websites, Proceedings of NDSS 2020.
- Savvas Zannettou, Jeremy Blackburn, Emiliano De Cristofaro, Michael Sirivianos, Gianluca Stringhini, Understanding Web Archiving Services and Their (Mis)Use on Social Media, Proceedings of ICWSM 2018.
- Jack Cushman, Ilya Kreymer, Thinking like a hacker: Security Considerations for High-Fidelity Web Archives, 2017; Ada Lerner, Tadayoshi Kohno, Franziska Roesner, Rewriting history: Changing the archived web from the present, Proceedings of the 2017 ACM SIGSAC, 2017.
- Ahmer Arif, Leo Graiden Stewart, Kate Starbird, Acting the Part: Examining Information Operations Within #BlackLivesMatter Discourse, Proceedings of the ACM on Human-Computer Interaction - CSCW, 2018.
- Louise Lief, What the news media can learn from librarians, Columbia Journalism Review, 2016.
- Clifford Lynch, Stewardship in the 'Age of Algorithms', First Monday 22(12), 2017.
- Ayush Goel, Jingyuan Zhu, Ravi Netravali, Harsha V. Madhyastha, Jawa: Web Archival in the Era of JavaScript, Proceedings of OSDI 2022, 2022.
- Luca Luceri, Ashok Deb, Adam Badawy, Emilio Ferrara, Red Bots Do It Better: Comparative Analysis of Social Bot Partisan Behavior, Technical Report arXiv:1902.02765, 2019.
- Xinyi Zhou, Reza Zafarani, Fake News: A Survey of Research, Detection Methods, and Opportunities, Technical Report arXiv:1812.00315, 2018.
- Jacob Eisenstein, Brendan O'Connor, Noah A. Smith, Eric P. Xing, Diffusion of Lexical Change in Social Media, PLoS ONE 9(11), 2014.
- Tom Wilson, Kaitlyn Zhou, Kate Starbird, Assembling Strategic Narratives: Information Operations as Collaborative Work within an Online Community, Proceedings of the ACM on Human-Computer Interaction - CSCW, 2018.
- Renee DiResta, The Digital Maginot Line, Ribbonfarm, 2018.
- Melanie Smith, Archives: Facebook Finds “Coordinated and Inauthentic Behavior” In the Philippines; Suspends a Set of Pro-Government Pages Ahead of May Elections, 2019.
- Justin Littman, Vulnerabilities in the U.S. Digital Registry, Twitter, and the Internet Archive, 2017; Justin Littman, Suspended U.S. government Twitter accounts, 2017.
- Mohamed Aturban, Michele C. Weigle, Michael L. Nelson, Difficulties of Timestamping Archived Web Pages, Technical Report arXiv:1712.03140, 2017.
- Scott G. Ainsworth, Michael L. Nelson, Herbert Van de Sompel, Only One Out of Five Archived Web Pages Existed as Presented, Proceedings of Hypertext 2015, 2015.
- Jennifer Golbeck et al., Fake News vs Satire: A Dataset and Analysis, Proceedings of the 10th ACM Conference on Web Science, 2018.
- Amelia Acker, Data Craft: The Manipulation of Social Media Metadata, 2018.
- Max Read, How Much of the Internet Is Fake? Turns Out, a Lot of It, Actually, New York Magazine, 2018.
- Michael L. Nelson, Why we need multiple web archives: the case of blog.reidreport.com, 2018.
- Mohammed Nauman Siddique, "Grampa, what's a deleted tweet?", 2018; Ed Summers, Delete Forensics, 2017.
- Melanie Ehrenkranz, How Archivists Could Stop Deepfakes From Rewriting History, 2018.
- Kate Starbird, The Surprising Nuance Behind the Russian Troll Strategy, 2018.
- Ed Summers, Blacktivists in the Archive, 2017.
- Clifford Lynch, Managing the Cultural Record in the Information Warfare Era, EDUCAUSE Review 53(6), 2018.
- Nicholas Confessore, Gabriel J.X. Dance, Rich Harris and Mark Hansen, The Follower Factory, The New York Times, January 27, 2018.
- Sawood Alam, Plinio Vargas, Michele C. Weigle, Michael L. Nelson, Impact of HTTP Cookie Violations in Web Archives, Technical Report arXiv:1906.07141, 2019.
- Amelia Acker, Mitch Chaiet, The weaponization of web archives: Data craft and COVID-19 publics, The Harvard Kennedy School (HKS) Misinformation Review, 2020.
- Nattiya Kanhabua, Philipp Kemkes, Wolfgang Nejdl, Tu Ngoc Nguyen, Felipe Reis, Nam Khanh Tran, How to Search the Internet Archive Without Indexing It, Proceedings of TPDL 2016.
- Hugo C. Huurdeman, Anat Ben-David, Jaap Kamps, Thaer Samar, and Arjen P. de Vries, Finding Pages on the Unarchived Web, Proceedings of JCDL 2016.
- Anat Ben-David, Counter-archiving Facebook, European Journal of Communication, 35(3), 2020.
- Anat Ben-David, Adam Amram, The Internet Archive and the socio-technical construction of historical facts, Internet Histories 2(1-2), 2018.
- Anat Ben-David, 2014 not found: a cross-platform approach to retrospective web archiving, Internet Histories 3(3-4), 2019.
- Anat Ben-David, What does the Web remember of its deleted past? An archival reconstruction of the former Yugoslav top-level domain, New Media & Society, 18(7), 2016.
- Farhan Asif Chowdhury, Lawrence Allen, Mohammad Yousuf, Abdullah Mueen, On Twitter Purge: A Retrospective Analysis of Suspended Users, 4th International Workshop on Mining Actionable Insights from Social Networks, 2020.
- Eric Nost, Gretchen Gehrke, Grace Poudrier, Aaron Lemelin, Marcy Beck, Sara Wylie, Visualizing changes to US federal environmental agency websites, 2016–2020, PLOS One, 2021.
- Mohamed Aturban, Michael L. Nelson, and Michele C. Weigle, Where Did the Web Archive Go?, Proceedings of Theory and Practice of Digital Libraries (TPDL), pp. 73-84, 2021.
- Kritika Garg, Himarsha Jayanetti, Michele C. Weigle, and Michael L. Nelson, Replaying Archived Twitter: When your bird is broken, will it bring you down?, Proceedings of the 2021 ACM/IEEE Joint Conference on Digital Libraries.
- Mohamed Aturban, Sawood Alam, Michael L. Nelson, Michele C. Weigle, Archive Assisted Archival Fixity Verification Framework, Proceedings of JCDL 2019, 2019.
- Brenda Reyes Ayala, Correspondence as the primary measure of quality for web archives: A human-centered grounded theory study, International Journal of Digital Libraries, 23, 2022, pp. 19-31.
- Adam Kriesberg, Amelia Acker, The second US presidential social media transition: How private platforms impact the digital preservation of public records, JASIST, 2022.
- Johannes Kiesel, Florian Kneist, Milad Alshomary, Benno Stein, Matthias Hagen, and Martin Potthast, Reproducible Web Corpora: Interactive Archiving with Automatic Quality Assessment, Journal of Data and Information Quality, 10(4), 2018. (see also: Let's compare memento damage measures!)
- Himanshu Zade, Morgan Wack, Yuanrui Zhang, Kate Starbird, Ryan Calo, Jason Young, Jevin D West, Auditing Google’s Search Headlines as a Potential Gateway to Misleading Content: Evidence from the 2020 US Election, Journal of Online Trust and Safety, 2022.