Releases: toolforge/tool-spacemedia

spacemedia-0.4.1

10 May 19:23

New sources

  • NASA/ESA James Webb Space Telescope: https://webbtelescope.org & https://esawebb.org
  • NASA ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer instrument aboard the Terra satellite): https://asterweb.jpl.nasa.gov
  • NASA JPL Photojournal: https://photojournal.jpl.nasa.gov
  • NASA SDO (Solar Dynamics Observatory): https://sdo.gsfc.nasa.gov
  • NOIRLab: https://noirlab.edu
  • Individuals on Flickr: Judy Schmidt, Kevin Gill, Pierre Markuse, Andrea Luck, Harry Stranger
  • New US military units, all merged into a single "US Space Force/Command" source
  • Enable automatic upload for:
    • US military (DVIDS & Flickr), DLR, KARI, James Webb Space Telescope (ESA & NASA): all files
    • NASA, IAU, ESO, NOIRLab, Hubble (ESA & NASA), Individuals: only files published after 2022 for now
  • Enable manual upload of other sources

Major features

  • Post bot activity to Mastodon and Twitter
  • Add SDC (Structured Data on Commons) for uploaded files
  • Start a complete calculation of perceptual hashes on the whole Commons database to detect duplicates (still ongoing as of May 2023...)
  • Report exact duplicate files to Commons administrators by parsing Special:ListDuplicatedFiles (at most 190 files, to avoid spamming them)
  • Translate non-English text using Google Translate
  • Support WebP image files
  • Blocklist of terms implying uninteresting content published by US military and NASA
  • New remote capabilities to compute hashes on remote machines more powerful than Toolforge pods
  • Switch spring schedules to toolforge jobs framework and Cloud VPS cronjobs
  • Support extracting information from Wikidata (ISS crew members, astronomical objects, telescopes, instruments...)
  • Detect and ignore courtesy photos in media published by US military and NASA (from ULA, SpaceX, Lockheed Martin...) using a blocklist of terms in media description and a blocklist of photographers in EXIF metadata
  • Look up (NASA) images on Commons by their id to avoid uploading non-exact duplicates, and upload a higher-resolution version if needed
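
The courtesy-photo detection listed above can be pictured as a pair of case-insensitive blocklists, one matched against the media description and one against the photographer recorded in EXIF metadata. The following is only a minimal sketch of that idea: the class name `CourtesyPhotoFilter`, its methods, and the blocklist entries used in the example are hypothetical and are not taken from the spacemedia code base.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not the spacemedia implementation.
final class CourtesyPhotoFilter {

    private final List<String> descriptionTerms;
    private final List<String> photographers;

    CourtesyPhotoFilter(List<String> descriptionTerms, List<String> photographers) {
        // Normalize the blocklists once so matching is case-insensitive.
        this.descriptionTerms = lower(descriptionTerms);
        this.photographers = lower(photographers);
    }

    /** Returns true if the description or the EXIF photographer contains a blocklisted entry. */
    boolean isCourtesy(String description, String exifPhotographer) {
        return containsAny(description, descriptionTerms)
            || containsAny(exifPhotographer, photographers);
    }

    private static boolean containsAny(String text, List<String> needles) {
        if (text == null) {
            return false; // missing metadata never matches
        }
        String haystack = text.toLowerCase(Locale.ROOT);
        return needles.stream().anyMatch(haystack::contains);
    }

    private static List<String> lower(List<String> values) {
        return values.stream()
            .map(v -> v.toLowerCase(Locale.ROOT))
            .collect(Collectors.toList());
    }
}
```

With blocklists such as `List.of("courtesy photo")` and `List.of("SpaceX")` (illustrative values), `isCourtesy("Falcon 9 launch (Courtesy photo)", null)` would flag the media, while a description and photographer containing neither entry would not.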

Minor features

  • Allow manually refreshing a media item by reassessing all its metadata
  • Display video/audio icons above preview images
  • New REST endpoint to return the last Commons timestamp
  • New REST endpoint to put a new hash association
  • NASA: Extract metadata for ISS and Artemis images
  • Initial support for uploading in chunks very large files exceeding memory on Wikimedia servers. Does not work for now :(
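
As a rough illustration of the chunked-upload item above, the first step is to split a file into fixed-size byte ranges that can be sent one after another. The sketch below only computes those ranges and does not call any upload API; the class `ChunkPlanner`, the method `plan`, and the chunk size are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; it performs no network calls.
final class ChunkPlanner {

    /**
     * Splits a file of fileSize bytes into consecutive chunks.
     * Each entry is {offset, length}; the last chunk may be shorter.
     */
    static List<long[]> plan(long fileSize, int chunkSize) {
        if (fileSize < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and chunkSize > 0");
        }
        List<long[]> chunks = new ArrayList<>();
        for (long offset = 0; offset < fileSize; offset += chunkSize) {
            long length = Math.min(chunkSize, fileSize - offset);
            chunks.add(new long[] {offset, length});
        }
        return chunks;
    }
}
```

Only one chunk needs to be held in memory at a time, which is the point of the approach for files too large to be handled in a single request.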

Behind the scenes

  • Update to Java 17 and Spring Boot 2.7
  • Update to latest versions of MediaWiki
  • Update to MariaDB 10.4: https://phabricator.wikimedia.org/T301949
  • Adapt to breaking changes on the hubblesite.org website
  • Adapt to a breaking change on the DVIDS CDN
  • Migrate from Phabricator Diffusion to Wikimedia GitLab
  • Migrate from eqiad.wmflabs to wikimedia.cloud
  • Hubble and Webb NASA websites handled as a single "STScI" repository
  • IAU, ESO, NOIRLab websites handled as "Djangoplicity" repositories
  • Use JPEG plugin from twelve-monkeys in order to read more files
  • Disable video support on Toolforge, as it requires too much memory
  • As usual, lots of general performance/reliability improvements and dependencies upgrades

Full Changelog: spacemedia-0.4.0...spacemedia-0.4.1

v0.4.0

10 May 18:26

Major features

  • Automatic upload of ESA media

Minor features

  • General reliability/performance improvements

Behind the scenes

  • Handling URL change from wmflabs.org to toolforge.org

Full Changelog: spacemedia-0.3.0...spacemedia-0.4.0

v0.3.0

24 May 22:45

First version performing automatic upload

Features completed

Major features

  • Detection of near-duplicate images using perceptual hashes
  • No longer consider Flickr Public Domain Mark eligible for upload, unless clearly U.S. Public Domain
  • Replace Flickr image title consisting of VIRIN identifiers by their album name, if any
  • Enforce minimal delay between successive uploads to meet Wikimedia bot requirements
  • Upload mode for each agency: disabled (default), manual, automatic
  • Handle change of license for Flickr and YouTube media
  • Add DVIDS support for U.S. Space Command, Space Force, Air Force Space Command, Space and Missile Systems Center
  • Enable automatic upload for images of U.S. Space Command and Air Force Space Command
  • Add YouTube support for Arianespace
  • Download of YouTube videos using youtube-dl
  • Retrieval of already uploaded YouTube videos using their id
  • Startup page improvements: AJAX loading of statistics, sortable table
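
Near-duplicate detection with perceptual hashes, as listed above, generally comes down to comparing two hashes bit by bit: visually similar images produce hashes that differ in only a few bits. The sketch below assumes 64-bit hashes stored as `long` values and a caller-chosen distance threshold. Both are assumptions for illustration, as are the names `PerceptualHashes`, `distance` and `isNearDuplicate`; the hash size and algorithm actually used by the tool are not stated in these notes.

```java
// Hypothetical helper for illustration only; assumes 64-bit perceptual hashes.
final class PerceptualHashes {

    /** Hamming distance: the number of bit positions in which the two hashes differ. */
    static int distance(long hashA, long hashB) {
        return Long.bitCount(hashA ^ hashB);
    }

    /** Treats two images as near-duplicates when their hashes differ in at most maxDistance bits. */
    static boolean isNearDuplicate(long hashA, long hashB, int maxDistance) {
        return distance(hashA, hashB) <= maxDistance;
    }
}
```

A distance of 0 means the hashes are identical; the threshold trades false positives against missed duplicates and would have to be tuned against real media.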

Minor features

  • Number of working threads configurable
  • Nice error pages for HTTP 404, 500, 501 errors

Behind the scenes

  • Allow resetting problems, duplicates, ignored media, perceptual hashes
  • Mirroring from Phabricator to GitHub
  • Dependencies upgrade: Spring Boot 2.2.7, Scribe 6.9.0, Flickr4Java 3.0.4, bootstrap 4.5.0
  • Retrieval of web dependencies through Webjars
  • Store last update date/time for each media
  • Store runtime information (last update start time, end time, duration) for each agency
  • Store date/time for each problem
  • Rework persistence of metadata information
  • Lots of bugfixes, performance improvements, reduced memory usage

Features started

  • GitHub actions
  • SonarCloud analysis
  • Code cleanup, test coverage, documentation