@UnbubbleHub

Unbubble Hub - Open Research

A space for researchers and engineers to come together and collaborate in developing tools to fight social polarization

The aim of this initiative is to provide a space for researchers and engineers to come together and collaborate in developing tools to fight social polarization. We want to help people form their own judgment, not by promoting "correct" opinions but by fostering meaningful, conscious disagreement and encouraging a plurality of perspectives.

In order to do so, we believe that information integrity tools should be open, transparent, and accessible to everyone.

Tip

See what we're up to and join us in the Discussions tab!

Motivation

We are witnessing a collision between two powerful forces shaping public discourse: social media platforms and governments.

Social media platforms have become central infrastructure for public discourse. Their recommendation algorithms, designed to maximize engagement, have demonstrably contributed to polarization and the spread of misinformation. The underlying mechanism is well-documented: algorithms that prioritize engagement metrics (likes, shares, comments) systematically favor content that triggers strong emotional reactions, including outrage, fear, and hostility. This is not a bug but a feature of advertising-driven business models.

The platforms themselves are opaque. Users cannot see why certain content appears in their feeds. Researchers struggle to study algorithmic effects without platform cooperation. Civil society cannot audit systems that shape billions of people's understanding of the world.

Now, governments are responding. Australia became the first country to enforce a nationwide ban on under-16s accessing social media (December 2025). Spain has announced measures to criminalize "manipulating algorithms to amplify illegal content" and systems to track "how digital platforms fuel division and amplify hate." France, Denmark, the UK, and others are considering comparable laws. The EU's Digital Services Act already mandates algorithmic transparency for large platforms.

These regulations respond to real harms, but the regulatory response raises its own questions. Will governments introduce new black boxes, this time state-controlled rather than corporate-controlled? Who will build the systems that track "how platforms fuel division"?

We believe that open, transparent technology is a big part of the solution. The technology choices governments make in the next few years will shape information infrastructure for decades. If these systems are built as proprietary black boxes, we will have traded one form of opacity for another. If they are built on open foundations, civil society will have tools to audit, critique, and improve them, regardless of who deploys them.

Topics of interest

What follows are just some of the topics we are interested in. Any project that works toward fighting social polarization and improving information integrity is welcome.

Event coverage research

Collecting all the material available online that covers an event, then understanding the relation each source has to the event (for example, testimony from an eyewitness vs. an official statement from one of the actors vs. a comment from an expert) and the relations between sources (for example, which source quotes which other source), in order to build a complete and structured picture of the event.
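One possible way to represent these relations is a small graph of sources, where each source carries its role with respect to the event and the set of sources it quotes. The sketch below is only an illustration: the `Role` values mirror the examples in the paragraph above, and the names `Source`, `quoted_by`, and `origin_sources` are hypothetical, not part of any existing Unbubble Hub package.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    # Roles taken from the examples above; a real taxonomy would be richer.
    EYEWITNESS = "eyewitness"
    OFFICIAL_STATEMENT = "official_statement"
    EXPERT_COMMENT = "expert_comment"


@dataclass
class Source:
    id: str
    role: Role
    # Ids of the other sources that this source quotes.
    quotes: set[str] = field(default_factory=set)


def quoted_by(sources: list[Source]) -> dict[str, set[str]]:
    """Invert the quote relation: for each source id, who quotes it."""
    index: dict[str, set[str]] = {s.id: set() for s in sources}
    for s in sources:
        for target in s.quotes:
            index.setdefault(target, set()).add(s.id)
    return index


def origin_sources(sources: list[Source]) -> list[str]:
    """Ids of sources that quote no other source, in input order."""
    return [s.id for s in sources if not s.quotes]


coverage = [
    Source("witness", Role.EYEWITNESS),
    Source("ministry", Role.OFFICIAL_STATEMENT),
    Source("analysis", Role.EXPERT_COMMENT, {"witness", "ministry"}),
]
```

With this toy data, `origin_sources(coverage)` lists the eyewitness and the ministry, and `quoted_by(coverage)` shows that both are quoted by the expert analysis. Whether "quotes no one" is a good proxy for a primary source is itself an open research question.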

Sources evaluation

When reading an article, how much can you trust what you read based on the author and the publisher hosting it? Which elements most reliably predict the trustworthiness of an article? What are the limits of this approach?

Narrative identification

When reading an article, can we identify framing patterns, rhetorical techniques, and narrative structures? What are the most important elements we can look for, in order to understand the narrative that the article assumes?

Information extraction

When reading an article, can we extract claims, facts, and key information from unstructured text and multimedia? Can we filter out opinions and ideological framing? What are the best schemas or theories for capturing and structuring unstructured media?

Fact and data tracking across sources

When comparing different sources, how can we tell exactly where they agree and where they disagree? Can we weigh the degree of certainty (a rumor vs. an official statement) and the time sequence of reports (recent vs. old information)?
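As a starting point for discussion, certainty and recency could be combined into a single weight per report. The sketch below multiplies a certainty prior by an exponential recency decay and sums the weights per reported value. The prior values, the 24-hour half-life, and the function names are all placeholder assumptions chosen for illustration, not validated parameters.

```python
# Hypothetical certainty priors per report type; the numbers are placeholders.
CERTAINTY = {"rumor": 0.2, "eyewitness": 0.6, "official_statement": 0.9}


def report_weight(kind: str, age_hours: float, half_life_hours: float = 24.0) -> float:
    """Certainty prior times a decay that halves every `half_life_hours`."""
    decay = 0.5 ** (age_hours / half_life_hours)
    return CERTAINTY[kind] * decay


def weighted_support(reports: list[tuple[str, str, float]]) -> dict[str, float]:
    """Sum report weights per claimed value.

    Each report is a (claimed_value, kind, age_hours) tuple.
    """
    totals: dict[str, float] = {}
    for value, kind, age_hours in reports:
        totals[value] = totals.get(value, 0.0) + report_weight(kind, age_hours)
    return totals


reports = [
    ("12 injured", "rumor", 0.0),
    ("8 injured", "official_statement", 24.0),
    ("8 injured", "eyewitness", 0.0),
]
support = weighted_support(reports)
```

Here the fresh rumor contributes 0.2 to "12 injured", while the day-old official statement and the fresh eyewitness report together contribute about 1.05 to "8 injured". A simple sum like this ignores dependence between sources (two outlets repeating the same wire report), which is one of the limits the question above points at.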

Multimodal analysis

How can we extend these tools beyond text to images, videos, and hybrid formats?

AI-Human feedback

How can we integrate AI-generated feedback with human feedback? How can we prevent the human feedback loop from being brigaded by ideological groups? How do we reconcile conflicts?

Sane ranking algorithms

Most current algorithms optimize for engagement, which favours outrage. Can we develop ranking algorithms that prioritize content and topics that are relevant and interesting but not polarizing? How do we balance personalization with exposure to diverse viewpoints?
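To make the trade-off concrete, here is a minimal sketch of one possible re-ranking scheme: a greedy pass that scores each item by relevance, subtracts a penalty for polarization, and adds a bonus for viewpoints not yet shown. It assumes that relevance and polarization scores in the range 0 to 1 and a coarse viewpoint label already exist for each item (producing them is the hard part). The `Item` fields, the `rerank` function, and the default weights are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Item:
    id: str
    relevance: float     # 0..1, assumed to come from an upstream model
    polarization: float  # 0..1, assumed to come from an upstream model
    viewpoint: str       # hypothetical coarse viewpoint label


def rerank(items: list[Item],
           polarization_penalty: float = 0.5,
           novelty_bonus: float = 0.3) -> list[Item]:
    """Greedy re-ranking: repeatedly pick the best-scoring remaining item."""
    remaining = list(items)
    seen_viewpoints: set[str] = set()
    ranked: list[Item] = []

    def score(item: Item) -> float:
        # Reward viewpoints that have not appeared higher in the ranking.
        bonus = novelty_bonus if item.viewpoint not in seen_viewpoints else 0.0
        return item.relevance - polarization_penalty * item.polarization + bonus

    while remaining:
        best = max(remaining, key=score)
        ranked.append(best)
        seen_viewpoints.add(best.viewpoint)
        remaining.remove(best)
    return ranked


feed = [
    Item("a", relevance=0.9, polarization=0.9, viewpoint="x"),
    Item("b", relevance=0.8, polarization=0.1, viewpoint="x"),
    Item("c", relevance=0.6, polarization=0.1, viewpoint="y"),
]
order = [item.id for item in rerank(feed)]
```

Sorting this toy feed by relevance alone would give a, b, c. With the default weights, the sketch instead yields b, c, a: the highly polarizing item drops to the bottom and the second viewpoint moves up. The two weights make the balance between personalization and diversity explicit, which also makes it something that can be inspected and audited.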

Cognitive Interface Design

How do visual cues, timing, and phrasing affect the reception of corrective information? What UI patterns encourage "slow thinking" over "fast thinking"? What UX best helps users avoid feeling defensive toward points of view they might not agree with?

Genealogy and Provenance Tracking

Can we trace the mutation of a narrative backward through time and across platforms to identify the point where context was stripped and the narrative was coined?

Open Datasets

The creation of open, high-quality, privacy-preserving datasets of labeled misinformation, narrative structures, and propaganda techniques.

Adversarial Testing

Proactively using our own tools to find weaknesses. How can these tools be weaponized, and how do we build defenses against our own technology?

Algorithmic Transparency Standards

Governments are beginning to mandate algorithmic transparency, but what "transparency" means in practice remains poorly defined. What information about an algorithm is necessary and sufficient for meaningful public accountability? How can we design systems that are "transparent by design", where the logic can be inspected without compromising user privacy or enabling gaming? What documentation standards (model cards, datasheets, audit logs) should become baseline requirements for any algorithm that shapes public discourse? How can non-experts meaningfully evaluate algorithmic behavior?

Kinds of contributions

Research

We welcome contributions that advance our understanding of polarization, information integrity, and the systems that shape public discourse. This includes:

  • Literature synthesis: Mapping what is already known about a topic, identifying gaps, and making academic research accessible to practitioners
  • Theoretical development: Proposing frameworks, taxonomies, or models that help structure complex problems
  • Empirical investigation: Designing and conducting quantitative or qualitative studies that test hypotheses or reveal patterns in data
  • Critical analysis: Evaluating existing tools, platforms, or policies for effectiveness, risks, and unintended consequences

Research contributions can take many forms: academic papers, documented analyses in a repository, or interactive websites with live demonstrations. We value rigor and intellectual honesty over credentials: what matters is the quality of the work and the willingness to engage with critique. Contributors with academic backgrounds and citizen researchers are equally welcome; we may be able to connect contributors with established research institutions when relevant.

Engineering

We build open-source tools that turn research insights into practical, reusable technology. Contributions include:

  • Libraries and packages: Implementations of algorithms for text analysis, narrative detection, source evaluation, and related tasks, primarily as npm packages for the Node/JavaScript ecosystem
  • APIs and services: Hosted endpoints that make functionality accessible without requiring local setup
  • Live demos: Web interfaces that showcase tools in action and let users experiment directly

We prioritize code that is well-documented, tested, and designed for reuse. Cross-contribution is encouraged: engineers often surface new research questions, and researchers often identify needs that demand new tooling.

Licensing

Yes, it is really open.

Code licensing: AGPL v3

  • GNU Affero General Public License
  • If someone builds a service using a modified version of the code, they must make their modified source available to the users of that service
  • If you benefit, you contribute back

Content licensing: CC BY 4.0

Code of Conduct and Governance

Each project has its own maintainer, who is responsible for it.

@carlomartinucci acts as Benevolent Dictator if disputes arise.

Every contribution is welcome, as long as it is respectful.

Popular repositories

  1. sources (Public)

    A module that takes a news event as input and returns sources, categorized and ranked, representing a range of diverse viewpoints

    Python 5 1

  2. .github (Public)

    Unbubble Hub - Open Research Initiative

  3. gdelt-pulse (Public)

    A pipeline that pulls GDELT's 15-minute data updates, extracts sources and annotations, and builds a searchable database of global news coverage. Designed for media bias and perspective research.
