<a href="https://colab.research.google.com/github/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/student/W1D1_Tutorial1.ipynb" target="_blank"><img alt="Open In Colab" src="https://colab.research.google.com/assets/colab-badge.svg"/></a>   <a href="https://kaggle.com/kernels/welcome?src=https://raw.githubusercontent.com/neuromatch/nasa-open-science/main/tutorials/W1D1_TheEthosOfOpenScience/student/W1D1_Tutorial1.ipynb" target="_blank"><img alt="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"/></a>

# Day 1: The Ethos of Open Science

**By Neuromatch Academy & NASA**

__Content creators:__ 

__Content reviewers:__ Leanna Kalinowski, Hlib Solodzhuk

__Production editors:__ Hlib Solodzhuk, Konstantine Tsafatinos, Ella Batty, Spiros Chavlis

___

## Tutorial Objectives

*Estimated timing of tutorial: 2 hours*

This tutorial is an introduction to open science, the principle and practice of making research products and processes available to all, while respecting diverse cultures, maintaining security and privacy, and fostering collaborations, reproducibility, and equity. In this module, you will take a closer look at what open science is, including the current landscape as well as the benefits and challenges. You will then get a glimpse into the practice of open science, including case studies and examples. Lastly, you will be presented with actions that you can take starting today, such as exploring communities that you can engage with.

In [None]:
# @markdown
from IPython.display import IFrame, display  # import display explicitly so the cell also runs outside IPython's implicit namespace
from ipywidgets import widgets
out = widgets.Output()
with out:
    # Plain strings: there are no placeholders to interpolate
    print("If you want to download the slides: https://osf.io/download//")
    display(IFrame(src="https://mfr.ca-1.osf.io/render?url=https://osf.io//?direct%26mode=render%26action=download%26mode=render", width=730, height=410))
display(out)

---
## Setup



###  Install and import feedback gadget


In [None]:
# @title Install and import feedback gadget

!pip install vibecheck --quiet

from vibecheck import DatatopsContentReviewContainer
def content_review(notebook_section: str):
    return DatatopsContentReviewContainer(
        "",  # No text prompt
        notebook_section,
        {
            "url": "https://pmyvdlilci.execute-api.us-east-1.amazonaws.com/klab",
            "name": "...", #TODO: add name
            "user_key": "...",
        },
    ).render()


feedback_prefix = "W1D1"

---

## Section 1: What is Open Science?

In this section, you take a closer look at what open science means, including the intended goals and outcomes of adopting open science as an individual and as part of a larger community. You then review examples of open science in action. Finally, you wrap up the section by taking a closer look at why adopting open science is needed.

### Ethos of Open Science

Let's begin by explaining the word "ethos".

> "Ethos is the distinguishing character, sentiment, moral nature, or guiding beliefs of a person, group, or institution."

**Merriam-Webster**


Note that "ethos" is not exactly "ethics", but it is a broad enough term to include the **moral attitudes** held by the individuals or institutions who practice open science. To clarify the moral element of this discussion, we speak of **"responsible open science"** going forward. Throughout this tutorial, we have integrated the ethical considerations around open science that shape how you share, give due credit, and work together. **"Practice the Golden Rule"** - treat others the way you would like to be treated in their situation.

####  Submit your feedback


In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Ethos_of_Open_Science")

### Open Science at NASA

NASA funds some of the most diverse research of any federal agency and has a history of sharing research and results going back to the **Apollo Program** in the 1960s. NASA's **Transform to Open Science** program emphasizes sharing guidelines and best practices that apply to its diverse research efforts, cultivating a culture of openness. NASA's commitment to open science enhances **collaboration** across various research domains, from astrobiology to physics, allowing broader access to important scientific information. NASA **datasets** include biology, chemistry, environmental science, geology, and other fields related to robotic and human planetary exploration, stellar evolution, and the search for extraterrestrial life.

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/image238.png?raw=true" style="width: 350px; height: auto;" alt = "NASA's logo"/>

The open science practices and principles that play a critical role in supporting NASA mission success are equally relevant to other government agencies and institutions. Similar considerations, approaches, and behaviors are needed in a variety of scientific contexts. **Tools for open science** frameworks and workflows **follow generally similar models**.

#### Case Study: Open Science in Action at NASA

Open science practices and principles can be applied to all stages of the research process. One early example of NASA's efforts to involve more people in science is the [exoplanet citizen science projects](https://exoplanets.nasa.gov/citizen-science/), with the **[Exoplanet Explorers](https://www.zooniverse.org/projects/ianc2/exoplanet-explorers)** being a significant part of this effort.

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/image266.jpg?raw=true" style="width: 100%; height: auto;" alt = "Exoplanet Explorers program results"/>

"Stargazing Live", a live television program, took place across three consecutive nights in 2017. The hosts invited viewers to identify exoplanets in an **open access dataset**. Within 48 hours of the program's debut, more than 10,000 people had participated in [Exoplanet Explorers](https://www.zooniverse.org/projects/ianc2/exoplanet-explorers) and classified over 2 million systems.

Following the first night of the program, the researchers watched the results roll in, as citizen scientists helped sift through the data. On the second night, enough people had participated that the researchers were able to share that 44 Jupiter-size candidate planets, 72 Neptune-size candidate planets, 53 sub-Neptune size candidate planets (larger than Earth but smaller than Neptune), and 44 Earth-size candidate planets had already been found and were undergoing additional analysis.
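To get a feel for the scale of these early results, the candidate counts reported above can be tallied in a few lines of Python. This is only an illustrative sketch: the dictionary and variable names are our own, and the counts are copied from the paragraph above rather than from the underlying dataset.

```python
# Candidate counts shared on the second night of the program (from the text above).
candidates = {
    "Jupiter-size": 44,
    "Neptune-size": 72,
    "sub-Neptune-size": 53,
    "Earth-size": 44,
}

# Total number of candidate planets across all size classes.
total_candidates = sum(candidates.values())
print(f"Total candidate planets: {total_candidates}")

# Share of each size class among all candidates.
for size_class, count in candidates.items():
    print(f"{size_class}: {count} ({count / total_candidates:.0%})")
```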

**Communities, working together on a problem, can rapidly find new results!** Open science enables this and more.

####  Submit your feedback


In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Open_Science_at_NASA")

### The Internet and Open Science

Historically, factors like time, access to sufficient tools and data, and physical proximity limited who could be involved in science, as well as how easily collaboration could take place within the scientific community. More recently, digital resources like the Internet have increased participation by **eliminating barriers to entry** and presenting a platform for digital collaboration on a global scale. The Internet offered people **access to the appropriate infrastructure** to conduct open science, while the practices of open science enabled **more people to engage with research products**. Unfortunately, challenges remain for people who don't have the right computational tools or don't speak the relevant languages.

The Internet creates many outlets for public hosting and **free access to research and data**. These outlets combined with **advances in computational power** enable nearly anyone to perform complex data analysis. It is now possible to connect participants, stakeholders, and outputs of open science on the Internet to make scientific processes and products easier to discover and access.

####  Submit your feedback


In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_The_Internet_and_Open_Science")

### Why Should We Do Open Science Now?

Science and science communication increasingly face severe pushback from the public because of inadequacies in the reproducibility of results and the spread of misinformation, respectively, that foster mistrust. The practice of open science counteracts this by involving community feedback to validate results in a more robust manner and combats misinformation **by making results available to the public**.

#### Reproducibility Challenges

Science becomes more robust and accurate when scientists validate their colleagues' results. However, reproducing results across the rapidly growing pool of published research is an overwhelming challenge:

- In 2011, the AAAS, publisher of Science, began requiring the authors of computational research reports to share data and software upon request.
- In 2018, a [research study](https://www.pnas.org/doi/full/10.1073/pnas.1708290115) investigated the reproducibility of 204 articles published in the journal Science after 2011. It found that **only 26% of the papers could be reproduced**. The two primary reasons were the inability to get access to the data and software and the fact that the methods were not described in sufficient detail.
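To make the percentage above more concrete, the short sketch below converts it into an approximate number of articles. The variable names are our own, and the resulting counts are rounded estimates implied by the 26% figure, not numbers quoted from the study.

```python
# Figures taken from the text above.
articles_examined = 204
reproducible_share = 0.26

# Rounded estimates implied by the percentage (not counts quoted from the study).
reproducible = round(articles_examined * reproducible_share)
not_reproducible = articles_examined - reproducible

print(f"Roughly {reproducible} of {articles_examined} articles could be reproduced.")
print(f"Roughly {not_reproducible} could not.")
```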

#### Case Study: Open Results Enable Iteration and Improve Error-Detection

We will look at an example of how closed science can restrict research impact by [following the outcome](https://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1318&context=usdeptcommercepub) of a highly cited journal article to understand how science functions to inform a field’s state of research, the decisions of policymakers, and the actions of society.

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/image220.png?raw=true" style="width: 100%; height: auto;" alt = "The global cooling error timeline"/>

A 1990 analysis of satellite temperature data concluded that the upper troposphere experienced no warming, a **finding that contradicted early climate model predictions**. Policymakers concluded from this result that researchers did not understand climate models well enough to warrant changes in environmental policy. The processed data from this study were made open-access but, as was typical for the time, **neither the original data nor the code used for processing and analyzing the data were shared** by the original research team. Eight years after the article was published, other scientists noticed that the original authors hadn't accounted for several important effects. This oversight introduced errors into the dataset and produced an artificial cooling trend in the temperature measurements. It took another five years and additional funding to reproduce the code and conduct a new analysis. **Thirteen years after the original paper, it was confirmed that the upper troposphere was warming and agreed with climate model predictions.**

*Note: Learn about the layers of Earth's atmosphere [here](https://www.sciencefacts.net/layers-of-atmosphere.html).*

**The inability of the scientific community to access an article’s original data and code slows the pace of discovery**, by thirteen years in this case, and forces other research teams to repeat the work (code) instead of moving on to new projects. This isn't the pace at which we want to advance science, with one step forward and two steps back to iterate and resolve problems.

The original research group did not intend to conceal or prevent access to their data and methods; **the community norms at that time simply did not include the sharing of data and software openly**. This is, in part, because keeping data and software closed allows researchers to keep a competitive advantage when seeking funding opportunities. In this case, the research group simply followed this common practice. This culture of closed science needs to be changed because the **practice of withholding code (or data or other research artifacts) can stifle scientific progress**. In the climate change example, a flawed study could have swiftly been corrected by open peer feedback, but it instead undermined the credibility of climate scientists. The cost to progress on climate change research and the lost benefit to society were enormous. It is imperative to shift the entire science ecosystem, policies, and rewards toward the prioritization of openness if the full and immediate benefits of research are to be realized.

#### Limitations of Scientific Publishing

Historically, scientific publishers have **charged subscription fees to access journals** and, often, article processing charges (APCs) to cover the costs of preparing a manuscript for press (even when the peer reviewers were volunteering their time). These practices limit both who can read papers and who can publish results.

Open access publishing has significantly increased the number of articles that are available as electronic copies online. A growing number of governments and funding agencies are starting to **mandate that research funded by taxpayers must be accessible to the public** after publication. However, the current hybrid system still does a poor job of allocating costs fairly across the research publication process (more on this in Day 5).

The issue of who has access to published papers also motivates open science. [For example](https://www.unesco.org/en/articles/can-science-be-more-equitable-so-everyone-enjoys-benefits-open-science-answer), even though more climate research is made available as open access than that from other scientific fields, the **majority of climate research articles, including many important ones, remain behind paywalls**. Climate misinformation is freely available to anyone online but scientific climate results are mostly hidden from the public behind paywalls. This practice does not increase trust in science.

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/image369.png?raw=true" style="width: 100%; height: auto;" alt = "Chart depicting percentage of open access publications"/>

####  Submit your feedback


In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Why_Should_We_Do_Open_Science_Now")

### What is Open Science?

What is open science exactly? To illustrate, first, we’ll present a definition of open science that was developed by the U.S. [federal government](https://open.science.gov/).

> "Open Science is the principle and practice of making research products and processes available to all, while respecting diverse cultures, maintaining security and privacy, and fostering collaborations, reproducibility, and equity."

**The White House Office of Science and Technology Policy Memo, 2022 (adapted)**

Let’s break down the definition a bit more:

- **Research products and processes should be available to all**, not just a small subset of experts, particularly if funded with public funds.
- **Research products and processes should be 'respecting diverse cultures'** – fostering an open dialogue between researchers, indigenous people, and local communities. This also means that research must respect the diversity of laws and customs in different countries and/or as they apply to different kinds of research.
- While open science is our aim, security and privacy remain important concerns. Therefore, select **sensitive information should be protected**.
- Of the stated principles, **"Fostering collaborations, reproducibility, and equity"**, the first two are research standards, while the last refers to the inclusion of people who might otherwise get left out.

Open science is a culture intended to promote science and its social impact.

####  Submit your feedback


In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_What_Is_Open_Science")

### How Do You Do Open Science?

The Ethos of Open Science is a broad term that encompasses the moral and ethical attitudes held by individuals and institutions about practicing ‘open’ science. There is an ethical element to **sharing both new knowledge and the processes used to obtain said knowledge**. It is important to note that there is no single, definitive way of practicing or conducting open science.

Diverse practices, assumptions, and goals are just part of the complexity of open science. There are also **divergent moral principles** that guide open science communities. Such principles are captured in **"codes of conduct"**. A code of conduct is a community governance mechanism that **outlines the principles and practices expected of a given research community’s members**, as well as the process for investigating and reprimanding those in violation of the code.

In a sense, a **code of conduct constitutes the moral backbone of a research community**. However, as with the numerous schools of thought, there are similarly many codes of conduct. In other words, there is no one set of universal principles that all open science practitioners abide by. For example, consider how [OLS](https://openlifesci.org/code-of-conduct), [INOSC](https://osf.io/6gsye), [ALLEA](https://allea.org/portfolio-item/the-european-code-of-conduct-for-research-integrity-2/), [AGU](https://www.agu.org/Plan-for-a-Meeting/AGUMeetings/Meetings-Resources/Meetings-code-of-conduct) and [Ethical Source](https://ethicalsource.dev/) all have different codes of conduct and guiding principles.

This great diversity responds to the growing proliferation of open science initiatives and the great use we can make of open science approaches to knowledge.

####  Submit your feedback


In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_How_Do_You_Do_Open_Science")

### Activity 1: Think About the What and How of Open Science

In this activity, reflect on your answers to the following questions.

- What does the act of open science look like? Does a scientist use or create something specific that would characterize their research as open? What comes to your mind?
- How do you currently share your materials (data, code, results)?
- How might you share materials in the future more openly?
- What stands in the way?

####  Submit your feedback


In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Think_About_the_What_and_How_of_Open_Science")

### Fostering Collaboration, Reproducibility, and Equity

The IDEA of Open Science: Inclusive, Diverse, Equitable, and Accessible.

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/image256.png?raw=true" style="width: 100%; height: auto;" alt = "Open source gives credit"/>

Openly using, making, and sharing research analyses, software, or datasets gives everyone credit for their work.

Sharing is grounded in the belief that access to information and the ability to collaborate are essential for advancing scientific understanding and solving complex problems.

Open sharing enables greater transparency in the scientific process and facilitates reproducibility; it enables collaboration and inclusion of more diverse perspectives and expertise; and it makes scientific knowledge more accessible to the public.

Not only does **open sharing help society**, but it also can benefit each of us as individual researchers. It can lead to **greater visibility, impact, and credit of your results, data, and software**; it can provide **access to new collaborations and ideas**, and it can fulfill ethical and social responsibilities.

####  Submit your feedback


In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Fostering_Collaboration_Reproducibility_and_Equity")

### Example of Open Science in Action: Radar Data and Climate Change

Have you ever seen weather forecast images for your location? That data comes from NEXRAD radar stations, many of which have been operating for over 30 years. The data has always been made publicly available, but can be difficult to use. It was mostly used for rain information, so stations didn’t see a need to make it readily accessible after 24 hours. Users who wanted the historical data from NEXRAD had to work through the following arduous steps:

- Go to a website.
- Make a request (but not one too large).
- Wait for a robot to read the data off tape storage and copy it online.
- Receive an email with instructions on where to download your data.
- Download the data.

The massive size of the dataset, more than 250TB, made it essentially impossible to do large-scale analysis. Nobody had the time to make these requests and download the data bit by bit.
           
**However, in 2015, all NEXRAD data were moved to and made freely available in the cloud. Usage of the dataset increased almost immediately!**
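To illustrate why cloud hosting removes the request-and-wait steps listed above, the sketch below builds the address under which one station's radar scans for a single day would sit in a public cloud bucket. Everything specific here is an assumption that does not come from this tutorial: the bucket name `noaa-nexrad-level2`, the `<year>/<month>/<day>/<station>/` layout, and the helper name `nexrad_day_prefix` are used for illustration only and should be checked against the current listing for the NEXRAD archive before use. The sketch makes no network calls.

```python
from datetime import date


def nexrad_day_prefix(station: str, day: date, bucket: str = "noaa-nexrad-level2") -> str:
    """Build the URL prefix for one station's scans on one day.

    Assumptions (not from the tutorial): the default bucket name and the
    <year>/<month>/<day>/<station>/ key layout.
    """
    key = f"{day:%Y/%m/%d}/{station.upper()}/"
    return f"https://{bucket}.s3.amazonaws.com/{key}"


# Example: the prefix for a hypothetical request, station KTLX on 27 October 2015.
prefix = nexrad_day_prefix("ktlx", date(2015, 10, 27))
print(prefix)
```

The point of the sketch is that a date and a station identifier are enough to locate the data directly, with no request form, tape retrieval, or email in between. Listing or downloading the files under such a prefix would require an object-storage client, which is not shown here.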

Researchers started using the NEXRAD data for other types of science. For example, they used <a href="https://aws.amazon.com/blogs/publicsector/the-birds-in-the-cloud-how-the-university-of-oklahoma-uses-nexrad-data-to-study-birds/">NEXRAD radar readings of birds to monitor flight patterns.</a> In particular, purple martins! Purple martins form huge roosts of up to 50,000 birds that can be tracked using radar. Their stunning aerial displays can now be tracked with the same technology previously reserved for rain measurements.
            
In another example of new NEXRAD uses, a NASA-led study linked variability in <a href="https://climate.nasa.gov/news/3201/climate-patterns-thousands-of-miles-away-affect-us-bird-migration/?s2=P1382021636_1683417608248277265">bird migration to large-scale climate patterns that originate thousands of miles away</a>. The better land managers understand current migration patterns and foresee behavioral changes in these birds due to climate change, the better they can direct their conservation and habitat restoration efforts. The newly accessible radar data provides valuable insight needed to achieve their goals. This study was funded by NASA and used NOAA NEXRAD data, which was made fully available for the first time by the AWS Public data program.

####  Submit your feedback


In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Radar_Data_and_Climate_Change")

### Who Does Open Science?

As briefly discussed in previous lessons, open science doesn’t only involve researchers; many other stakeholders are affected by the outcomes of open science. Stakeholders include any individuals who can affect or be affected by open science projects.

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/image217.png?raw=true" style="width: 100%; height: auto;" alt = "Open science stakeholders"/>

Scientific research should benefit humanity. Although open science has many stakeholders, the advantageous interaction between science and society takes place among three core groups: **scientific researchers**, **policymakers**, and **the public**. Researchers do science and share their results with policymakers and the general public to inform their decisions and improve their lives. The public helps to fund research through taxes and can provide input to future areas of study. Policymakers help to implement measures that are informed by scientific results to improve the health, environment, and livability of society.

These three stakeholder groups remain central to the world of open science. However, the inclusive nature of open science demands participation from the broader public. Growth in public participation in science can occur by removing barriers to those historically excluded and by expanding the community of people who support scientific research itself.

Here, we list some core groups who we envision as taking part in and/or benefitting from open science while being fully aware that this list is not exhaustive and the categories we choose here have very blurred boundaries.

#### Researchers

Researchers are often thought of as the ones who do open science to benefit others. However, researchers themselves can also greatly benefit from open science. Their work can achieve higher visibility among colleagues and the public, they receive credit for a full range of activities related to their science (including time spent sharing data and code, for instance), and they have more access to datasets.

A team of supporters and collaborators enables this research to take place. Open science aims to include these supporting members of the scientific process and ensure they receive credit for their contribution to improving science.

#### Policymakers

Policymakers represent another key community in the science environment. Policymakers can reference scientific findings to inform their decisions for the betterment of society. Those who help in the understanding and dissemination of these policies (including educators and science journalists) are crucial to this process. Policymakers can also play important roles in ensuring and facilitating open science by setting data management processes, encouraging open access legislation, and developing ethical guidelines for experiments. Policymakers can benefit from open science by gaining better access to scientific output via the open sharing of research results.

#### General public

The public plays a crucial role in science today as consumers of scientific results who make decisions based on, and adhere to policies shaped by, scientific results. Open science can make scientific results, data, and workflows more accessible to the public by strengthening routes of access to trustworthy sources of information, which in turn increases trust in science. The public can also take part in open science through community science projects, for example as volunteers to collect or manage data. As a result, participants boost their understanding of science and feel empowered through opportunities to exert influence.

Open science can strengthen the connection between all of these groups. Communication between researchers and both the public and policymakers stands to drastically improve with more transparent and accessible scientific knowledge.

####  Submit your feedback


In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Who_Does_Open_Science")

---

## Section 2: Why is Open Science Important?

In this section, you will learn how adopting open science benefits you as a researcher and society. You will also learn about some of the challenges and hurdles with using open science principles and how to navigate them.

---
# Summary

After completing this day, you are able to:

- Explain what open science is, why it's a good thing to do, and list some of the benefits and challenges of open science adoption.
- Describe the practice of open science, including considerations when writing a management plan and the tasks in the "Use, Make, Share" framework.
- Evaluate available options when determining whether research products should or should not be open.
- List ways to connect with others who are part of the open science community.