Multimodal language models for GalaxyZoo image interpretation

Rationale

The rationale of this project is to leverage existing Large Multimodal Models (LMMs) to engage meaningfully with astronomical images. The overarching goal is to fine-tune a language-and-vision model such as LLaVA on a curated dataset from the Galaxy Zoo project.

You can see examples of these conversations on Galaxy Zoo Talk here:

https://www.zooniverse.org/projects/zookeeper/galaxy-zoo/talk/1270


The steps of the project are as follows:

  1. Explore the Galaxy Zoo Talk dataset.
  2. Read and understand the high-level details of the LLaVA and LLaVA-Med papers.
  3. Summarise the discussion text with an LLM, using either open-source or proprietary models.
  4. Curate the image-summary pairs for instruction tuning (see the sketch after this list).
  5. Fine-tune the model.
  6. Evaluate the model.
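As a rough illustration of steps 3 and 4, the sketch below summarises the comments of each Talk thread with an off-the-shelf open-source summarisation model and writes LLaVA-style instruction-tuning records. It is a minimal sketch under assumptions: the input file name, its fields, the prompt, and the choice of summarisation model are all placeholders rather than part of the actual project pipeline.

```python
# Sketch: summarise Galaxy Zoo Talk threads and build LLaVA-style
# instruction-tuning records. File names, field names, and the prompt
# below are illustrative placeholders.
import json
from transformers import pipeline

# Any open-source summarisation or chat model could be swapped in here.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# Hypothetical input: one record per Talk thread, with the subject id,
# the image file name, and the list of comments.
with open("galaxy_zoo_talk_threads.json") as f:
    threads = json.load(f)

records = []
for thread in threads:
    comments = " ".join(thread["comments"])
    # Long threads may exceed the model's input length; truncation keeps
    # the sketch simple, a real pipeline might chunk the thread instead.
    summary = summarizer(comments, max_length=128, min_length=20,
                         do_sample=False, truncation=True)[0]["summary_text"]

    # Conversation-style record: the human turn carries the <image> token,
    # the assistant turn carries the summarised description.
    records.append({
        "id": thread["subject_id"],
        "image": thread["image_file"],
        "conversations": [
            {"from": "human", "value": "<image>\nDescribe this galaxy."},
            {"from": "gpt", "value": summary},
        ],
    })

with open("galaxy_zoo_instruct.json", "w") as f:
    json.dump(records, f, indent=2)
```

A proprietary API could replace the summarisation pipeline; the record layout shown above mirrors the conversation-style JSON used for LLaVA instruction tuning.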

In the LLaVA architecture, the pre-trained CLIP visual encoder ViT-L/14 is connected to the LLaMA decoder.
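For intuition, here is a minimal PyTorch sketch of that connection: CLIP patch features are mapped by a projection layer into the language model's hidden size so they can be consumed alongside token embeddings. This illustrates the idea only; the model names, the 4096 hidden size (LLaMA-7B), and the single linear projection are assumptions, not LLaVA's actual implementation.

```python
# Minimal sketch of a LLaVA-style vision-to-language connection.
import torch.nn as nn
from transformers import CLIPImageProcessor, CLIPVisionModel

class VisionToLanguageProjector(nn.Module):
    def __init__(self, llm_hidden_size=4096,
                 vision_model="openai/clip-vit-large-patch14"):
        super().__init__()
        self.vision_tower = CLIPVisionModel.from_pretrained(vision_model)
        # A single linear layer maps CLIP features to the LLM width
        # (the original LLaVA uses a linear projection; later versions use an MLP).
        self.projector = nn.Linear(
            self.vision_tower.config.hidden_size, llm_hidden_size)

    def forward(self, pixel_values):
        # Patch-level features, dropping the [CLS] token.
        feats = self.vision_tower(pixel_values).last_hidden_state[:, 1:, :]
        # Output shape: (batch, num_patches, llm_hidden_size), ready to be
        # prepended to the text embeddings fed to the LLaMA decoder.
        return self.projector(feats)

processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-large-patch14")
# Example (with `images` a list of PIL images):
# image_tokens = VisionToLanguageProjector()(
#     processor(images, return_tensors="pt").pixel_values)
```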

You can watch the hack presentation given by Jo during the telecon.

There is also a good video describing multimodal language models here: https://www.youtube.com/watch?v=mkI7EPD1vp8

Dataset

References

Here is a list of references to get started on the subject.

LLM-specific resources:
