Scott Veirs edited this page Nov 3, 2022 · 47 revisions

Welcome to the orcadata wiki!

This is a place to share and collaborate, especially on bioacoustic analysis of real-time and archived audio data related to the Orcasound open source project. Here you can learn more about Orcasound's machine learning resources related to orcas (training sets | test sets) and about access to Orcasound data, both archived training and testing data and real-time audio streams. You may also be interested in the synopses of projects that leverage these open data at

Data Resources

Most-recent progress (within the last year)

2022


  • Sep: Orcasound's GSoC 2022 contributors make final reports; DemocracyLab hackathon (9/10) connects to orcamap; Microsoft hackathon (9/20-22, GitHub Project) refines OrcaHello UI, model training/deployment/monitoring, and notifications, begins annotations of SRKW pod and call type, and establishes first Kaggle for orca calls
  • Aug: HALLO workshop on open data for SRKW movement forecast modeling (Aug 31 - Sep 01); Orcasound applies for AWS Open Data sponsorship (2 years); planning for Microsoft and DemocracyLab hackathons in Sept.
  • Jul: First blog posts from Orcasound GSoC 2022 contributors regarding: open source approaches to de-noising and source separation; ingestion of OOI hydrophone data from Oregon; refinement of the Orca Active Learning tool code & deployment.
  • Jun: Orcasound Google Summer of Code (GSoC) 2022 students begin coding
  • May: At DCLDE 2022 workshop, Beam Reach extern Emily Vierling shares her Haro Humpback open data & dictionary project, including a humpback non-song vocalization dictionary based on recordings from Haro Strait, WA, and an annotated training data set for 12 humpback signal types.
  • Apr: Earth Day hackathon organizes Orcasound open data visualization opportunities; OrcaHello Azure subscription extended until Oct 2022.
  • Mar: OrcaHello Dashboard reaches 3,500 annotated 1-min candidates; Orcasound and HALLO project present at the DCLDE workshop in Hawaii
  • Feb: Orcasound accepted as 2022 GSoC host organization (3rd year)
  • Jan: OrcaHello tag cloud curated using standardized dictionary of labels.

2021


  • Dec: Orcasound presents at the Acoustical Society of America meeting in Seattle
  • Nov: SRKWs in Puget Sound, humpbacks in Haro! OrcaHello migrates to new Azure subscription; coordination with HALLO on ASA/DCLDE/SSEC talks; Orcasound extern Emily Vierling catalyzes humpback non-song vocalization label standardization.
  • Oct: Beluga in Puget Sound! OrcaHello team improves real-time inference system during annual hackathon (Oct 12-14), including re-training model, continuous integration, moderator UI enhancements, and documentation. MBARI publishes acoustic archive via AWS open data repository.

For more details, see the growing list of documentation pages for each Orcasound machine learning effort.

Deeper history of AI for Orcas project

Since the early 2000s, members of the Orcasound community have been contemplating how to apply artificial intelligence to the problem of detecting orcas acoustically. Orcasound's AI for Orcas project page describes the evolution of our collective efforts. #ai4orcas