marinelives-collaboratory


Depository for MarineLives Collaboratory

To accompany Colin Greenstreet's November 19th 2024 talk at the IHR Digital History seminar on Machine Learning and Historical Research, MarineLives is launching an online collaboratory. We welcome non-coding and very-low-coding historians with open arms.

The collaboratory is aimed at master's and doctoral students in history who would like to explore how machine learning techniques based on large language models can be incorporated into their research design and research processes. Each session will be built around specific Historical Research Use Cases. Participants are invited to bring their own use cases for discussion.

We will look at a wide range of analytical frameworks, subject matter and periods, directed by the research interests of participants in the collaboratory, and will explore how large language models can assist with the design and conduct of different types of research.

The goals of the collaboratory are to:

  1. Build useful research tools for real historical research use cases which can be put into immediate effect
  2. Develop and document shared knowledge of large language models applied to historical research use cases
  3. Build a community of doctoral students interested in large language model applications to historical research

The collaboratory draws on Colin Greenstreet's work in digital history, natural language processing, machine transcription, and large language models, as co-founder in 2012 of the volunteer-led MarineLives digital history project. Joining us as a US-based academic advisor, with a strong interest in machine transcription and digital techniques in the context of Brazilian and Atlantic world history, is Dr Thiago Krause, associate professor at Wayne State University, Michigan. Joining us as a European-based academic advisor, with a strong interest in digital techniques in the context of eighteenth-century North American history, is Dr Mark L. Thompson, senior lecturer of American Studies at the University of Groningen.

Participants who have registered to take part in the collaboratory include graduate students and faculty from Antwerp, Cambridge, Groningen, Harvard, Leuven, London, Oldenburg, Wayne State, Yale and York.

We have an active wiki, with content frequently added.


Weekly sessions on Zoom: Tuesdays @ 4 pm UK time; 5 pm Paris, Berlin, Madrid; 11 am EST

Weekly office Zoom drop-in to discuss student use cases: Thursdays @ 4 pm UK time; 5 pm Paris, Berlin, Madrid; 11 am EST


The collaboratory will meet online once a week for 60 minutes, starting in the week of November 25th 2024.

Proposed topics for November and December 2024:

Proposed topics for January 2025:

Possible future topics: Archival APIs; AI-enhanced Dublin Core compliant metadata; Archival workflow; Building history strategy games; Creating fine-tuning datasets; Distant reading; Historical simulations; Hugging Face; Knowledge graphs; Integration with Semantic Scholar; Linked Open Data; Organising personal archives; SQL; Raw HTR text correction, modernization and summarization; Visualization; Scholarly editing in the world of LLMs


Collaboratory resources:

Anthropic - summarization

GitHub - marinelives-collaboratory

Google's Colab

Google's NotebookLM

Hugging Face - MarineLives Organisation

Pinecone - vector databases

zotero-marinelives-collaboratory