# Day 2: Open Tools and Resources

**By Neuromatch Academy & NASA**

__Content creators:__ 

__Content reviewers:__ Leanna Kalinowski, Hlib Solodzhuk

__Production editors:__ Hlib Solodzhuk, Konstantine Tsafatinos, Ella Batty, Spiros Chavlis

___

## Tutorial Objectives

*Estimated timing of tutorial: 2 hours*

This day is designed to help you get started on your journey to practicing open science. It offers an introductory view of the concepts and resources that are fundamental to open science. The bridge between the concepts and the practice of the concepts is something called the use, make, share framework. There are many methods and models that define how to get started with open science. The use, make, share framework was constructed to help you immediately assign purpose to the concepts and tools that are covered in this module as well as in the entire course curriculum. All of the information that you learn here will be addressed in more detail as you participate in other days but can also be applied immediately after completing this tutorial.

In [None]:
# @title Tutorial slides

from IPython.display import IFrame, display
from ipywidgets import widgets
out = widgets.Output()

link_id = ""

with out:
    print(f"If you want to download the slides: https://osf.io/download/{link_id}/")
    display(IFrame(src=f"https://mfr.ca-1.osf.io/render?url=https://osf.io/{link_id}/?direct%26mode=render%26action=download%26mode=render", width=730, height=410))
display(out)

---

## Section 1: Introduction to the Process of Open Science

In this section, you will review the definition of several common terms in the context of open science, including research products, data, software, and results. In addition, you will read examples that demonstrate how these open-science tools are used in practice. The lesson wraps up with an example of how one group openly shared their data, results, software, and paper.

### Definition of Research Products

Scientific knowledge, or research products, take the form of:

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D2_OpenToolsAndResources/static/image5.png?raw=true" alt = "Research products: Data, Code and Software, Results."/>

Within these research products are additional types of products, such as methodologies, algorithms, and physical artifacts.

#### What is Data?

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D2_OpenToolsAndResources/static/image6.png?raw=true" alt = "Data components: Measurements, Statistics, Facts, Metadata."/>

In general, data are pieces of information about a subject, including **theoretical truths**, **raw measurements**, or **highly processed values**.

There can even be **data about data, called metadata**. In our lessons, when we talk about data, we are referring to scientifically or technically relevant information that can be stored digitally and accessed electronically, such as:

- Information produced by missions and experiments, including calibrations, coefficients, and documentation.
- Information needed to validate scientific conclusions of peer-reviewed publications.

Open data can have many characteristics, including rich and robust metadata, and be made available in a range of formats. These characteristics are detailed later in this day and even further in the day on Open Data.
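To make the idea of "data about data" concrete, here is a minimal Python sketch that stores a few measurements together with a metadata record. The values and field names (`title`, `variable`, `units`, `license`) are hypothetical and do not follow any particular metadata standard.

```python
import json

# Hypothetical measurements: the data themselves.
measurements = [12.1, 12.4, 11.9]

# Hypothetical metadata: information that describes the data.
# Field names are illustrative, not taken from a formal standard.
metadata = {
    "title": "Example river discharge measurements",
    "variable": "discharge",
    "units": "m^3/s",
    "n_values": len(measurements),
    "license": "CC-BY-4.0",
}

# Keep data and metadata together so one is never shared without the other.
record = {"metadata": metadata, "data": measurements}
serialized = json.dumps(record, indent=2)
print(serialized)
```

A repository would typically ask for this kind of information in its own form or schema; the sketch only illustrates the distinction between the values and their description.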

#### What is Code?

Many scientists write source code to produce software to **analyze data** or **model observations**. Code is a language that humans can type and understand. Software is often a collection of programs, data, and other information that a computer system uses to perform specific tasks. Scientists write and use many different types of software as part of their research.

**General Purpose Software** – Software produced for widespread use, not specialized scientific purposes. This encompasses both commercial software and open-source software.

**Operational and Infrastructure Software** – Software used by data centers and large information technology facilities to provide data services.

**Libraries** – Generic tools that implement well-known algorithms, provide statistical analysis or visualization, etc., which are incorporated into other software categories.

**Modeling and Simulation Software** – Software that either implements solutions to mathematical equations given input data and boundary conditions or infers models from data.

**Analysis Software** – Software developed to manipulate measurements or model results to visualize or gain understanding.

**Single-use Software** – Software written for use in unique instances, such as making a plot for a paper or manipulating data in a specific way.

Some of the tools that you can use to develop software are introduced in Day 4. Understanding how to find and use others' code, create your own, and share it is an important part of advancing science and is covered in the day on Open Code.

#### What are Results?

Results capture the different **research outputs** of the scientific process. **Publications are the most common type** of results, but results can include a number of other products. Both data and software can be considered types of results, but when we discuss results, we will focus on the other types. Results can include the following:

- Peer-reviewed publications
- Computational notebooks
- Blog posts
- Videos and podcasts
- Social media posts
- Conference abstracts and presentations
- Forum discussions

You may already be familiar with the research lifecycle, but still unfamiliar with the types of results that can be shared openly throughout this process. When sharing results, we strive to be as open as possible, with the goal of increasing the reproducibility, accessibility, and inclusivity of our science. Throughout the research lifecycle, there are multiple opportunities to openly share different results that can lead to new collaborations and lines of inquiry. Additional details on the scope of open results are shared in Day 5 - Open Results.

### Using Tools for Open Science in Practice

The following subsection explores different tools and resources available to researchers for using, making, and sharing open science. As mentioned, it is important to think about how to integrate open science principles across all stages of the research process. Here is an overview of one way the various pieces might work together.

#### The Components of Open Science

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D2_OpenToolsAndResources/static/image7.png?raw=true" alt = "Components of Open Science: Data, Software, Results and Paper."/>

The four principal components of open science can be organized in a pyramid of openly-shared research products.

The research paper, closely tied to the results, sits at the top of the pyramid and summarizes how you’ve combined your software and your data to produce your results.

The practice of sharing these components can occur at varying degrees of completeness. For the following guidance on how to share components of open science, we simplify the range of completeness to "good", "better", and "best." This range reflects one’s commitment to sharing open science at all steps in the research process and to all of its products.

#### Sharing Open Data

Data can be shared easily through many different services. The best way to share scientific data is often through a **long-term data repository** that will both preserve your data and make it discoverable. The image below lists some of the considerations when sharing data through [Zenodo](https://zenodo.org/), a generalist data repository. These considerations would be similar for other data repositories. See Day 3 - Open Data for more details on sharing open data.

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D2_OpenToolsAndResources/static/image8.png?raw=true" alt = "Practices for Open Data."/>

#### Sharing Open Code

Open code is often shared through an **online version-controlled platform** that **allows others to contribute** to the software and provides a history of changes to the software. For example, many researchers choose to post code files on [GitHub](https://github.com/) with a BSD 3-Clause license. This permits others to contribute to and reuse the software. Steps to preserve code and make it discoverable are discussed in Day 4 - Open Code.

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D2_OpenToolsAndResources/static/image9.png?raw=true" alt = "Practices for Open Code."/>

#### Sharing an Open Paper

Researchers can choose to publish in a **journal with an open-access license**. Researchers can search for open-access journals through the Directory of Open Access Journals (DOAJ). (See Day 5 - Open Results).

#### Sharing Open Results

When sharing results, include the **methodology** used to produce them (i.e., the “provenance”) **directly with your software**. Software tends to evolve over time, while its outputs can remain fairly consistent. Sharing your methodology therefore helps others reproduce older results with newer software, even if the exact steps needed to produce them change as the software evolves.
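As an illustration of keeping provenance next to a result, the following Python sketch saves a computed value together with basic facts about how it was produced. The helper `build_provenance`, the field names, and the toy analysis are hypothetical; real projects often record more (for example, package versions and input file identifiers).

```python
import json
import platform
import sys
from datetime import datetime, timezone


def build_provenance(parameters):
    """Collect basic facts about how a result was produced (illustrative only)."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "platform": sys.platform,
        "parameters": parameters,
    }


# Hypothetical analysis: the mean of a small list of values.
values = [2.0, 4.0, 6.0]
result = {"mean": sum(values) / len(values)}

# Store the result and its provenance in one record.
output = {
    "result": result,
    "provenance": build_provenance({"n_values": len(values)}),
}
print(json.dumps(output, indent=2))
```

Writing this record to a file alongside the code gives later readers a starting point for reproducing the result, even after the software has changed.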

#### An Open Science Project Example

Here is an example of how one group openly shared their data, results, software, and paper, each with its own unique identifier. Note that data and software can each have multiple identifiers, enabling others to cite all versions or one unique version.

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D2_OpenToolsAndResources/static/image10.jpeg?raw=true" alt = "Data, Results(Paper) and Software of Particular Study."/>

[Data](https://doi.org/10.5281/zenodo.3688691), [Results](https://doi.org/10.1175/JHM-D-19-0084.1), [Software](https://github.com/c-h-david/rapid).
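The data and paper links above are Digital Object Identifiers (DOIs), while the software link is a plain GitHub URL. A DOI consists of a prefix that identifies the registrant and a suffix that identifies the item. The short Python sketch below splits the two DOIs from this example into those parts; the helper name `split_doi` is hypothetical.

```python
from urllib.parse import urlparse


def split_doi(doi_url):
    """Return (prefix, suffix) for a DOI given as a https://doi.org/ URL."""
    path = urlparse(doi_url).path.lstrip("/")
    # Only the first "/" separates prefix and suffix; suffixes may contain more.
    prefix, _, suffix = path.partition("/")
    return prefix, suffix


# DOI links taken from the example above.
links = {
    "data": "https://doi.org/10.5281/zenodo.3688691",
    "paper": "https://doi.org/10.1175/JHM-D-19-0084.1",
}

for name, url in links.items():
    prefix, suffix = split_doi(url)
    print(f"{name}: prefix={prefix} suffix={suffix}")
```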

### Key Takeaways

In this section, you learned:

- Scientific knowledge, or research products, take the form of data, software, and results.
- In general, data are pieces of information about a subject, including theoretical truths, raw measurements, or highly processed values.

---

## Section 2: General Tools for Open Science

This section introduces you to commonly used tools in open science. It starts with a brief introduction to open science tools and describes **persistent identifiers**, **one of the most common open science tools** in use, which help ensure reproducibility, accessibility, and recognition of scientific products. This is followed by descriptions of other common open science tools that are applicable regardless of your field of study. The section wraps up with a description of open science and data management plans, which are a key component of sharing your science throughout the research process.

### Introduction to Open Science Tools

The word "tools" refers to any type of resource or instrument that can be used to support your research. In this sense, tools can be a collection of useful resources that you might consult during your research, software that you could use to create and manage your data, or even human infrastructure such as a community network that you join to get more guidance and support on specific matters.

In this context, open science tools are any tools that enable and facilitate openness in research, and support responsible open science practices. It is important to note that open science tools are often open source and/or free to use, but not always.

Open science tools can be used for:
- **Discovery** - Tools for finding content to use in your research.
- **Analysis** - Tools to process your research output, e.g. tools for data analysis and visualization.
- **Writing** - Tools to produce content, such as Data Management Plans, presentations, and preprints.
- **Publications** - Tools to use for sharing and/or archiving research.
- **Outreach** - Tools to promote your research.

In this lesson, we introduce you to some of the most general open science tools such as persistent identifiers, metadata, documentation, and open science and data management plans. Regardless of the field of study, these tools and practices are some of the things that you will encounter as you use, make, or share your research. Read more about open science tools on [OpenSciency](https://opensciency.github.io/sprint-content/open-tools-resources/lesson1-intro-open-science-tools.html).

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Benefits_To_You")

### Activity 2: Benefits to You

In this activity, reflect on your answers to the questions and discuss them in a group.

- Can you find your own previous work, post-publication and/or pre-publication? Can you bring your research materials (data, code, results) with you if you change institutions?
- Can you find the work of your collaborators? Of scientists in other fields that you find interesting? Have you reached out to others to collaborate with them after finding interesting results?
- Are people in your field giving and getting credit for work done?

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Activity_Benefits_To_You")

### Benefits to Science

#### Transparent Science is Reproducible Science

When computers are used to produce scientific research, the **code is considered a "method"**, much as a set of instructions for working with cells or agar plates is considered a method in a laboratory setting. Peer-reviewed methods are an essential step in the scientific process. When these steps are not shared, no one else can reproduce the work or build upon it for future scientific endeavors. **Open methods allow people to judge whether or not the methods are trustworthy**. In Section 1, the story of the Global Cooling Error presented a poignant example of science that was not reproducible because of a lack of data transparency.

#### Open Science Can Improve Accuracy

A 2022 [study](https://www.nature.com/articles/s41562-021-01193-7) found that researchers who practice transparency and promote verifiability benefit because readers and stakeholders can judge whether the presented results are accurate and, according to a related [study](https://www.cmu.edu/dietrich/sds/docs/loewenstein/MeasPrevalQuestTruthTelling.pdf), whether they were produced without questionable research practices that lead to misleading or unreliable results.

Open science also allows others to scrutinize the analytic decisions of researchers, such as whether the analysis was planned before or after observing the data, according to a 2018 [study](https://www.pnas.org/doi/full/10.1073/pnas.1708274114).

This allows others to check if they can arrive at the same conclusion as the original research team and facilitates stronger public trust and support, according to a 2021 UNESCO [report](https://unesdoc.unesco.org/ark:/48223/pf0000379949).

#### Case Study: Allen Brain Observatory. When Open Science Leads to More Discoveries

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/AllenInstitute.png?raw=true" alt = "Allen Institute logo"/>

Since its founding, the **Allen Institute** has made open data one of its core principles. Specifically, it has become known for **generating and sharing survey datasets** within the field of neuroscience, taking inspiration from domains such as astronomy, where such surveys are common. These survey datasets (1) are collected in a highly standardized manner with stringent quality controls, (2) contain a volume of data that is much larger than typical individual studies within their particular disciplines, and (3) are collected without a specific hypothesis to facilitate a diverse range of use cases.

The **Allen Brain Observatory** consists of a set of standardized instruments and protocols designed to carry out surveys of cellular-scale neurophysiology in awake brains. Its initial focus was on neuronal activity in the mouse visual cortex. 

One of the ways the research community uses the Allen Brain Observatory datasets is **generating novel discoveries about brain function**:

- [Sweeney and Clopath, 2020](https://elifesciences.org/articles/56053) used Allen Brain Observatory two-photon imaging data to explore the stability of neural responses over time. The authors found that, indeed, population coupling is correlated with the change in orientation and direction tuning of neurons over the course of a single experiment, an unexpected result linking population activity with individual neural responses.
- [Bakhtiari et al., 2021](https://www.biorxiv.org/content/10.1101/2021.06.18.448989v3) examined whether a deep artificial neural network (ANN) could model both the ventral and dorsal pathways of the visual system in a single network with a single cost function. Comparing the representations of these networks with the neural responses in the two-photon imaging dataset, they found that the single pathway produced ventral-like representations but failed to capture the representational similarity of the dorsal areas.
- [Fritsche et al., 2022](https://www.jneurosci.org/content/42/10/1999) analyzed the time course of stimulus-specific adaptation in 2365 neurons in the Neuropixels dataset and discovered that a single presentation of a drifting or static grating in a specific orientation leads to a reduction in the response to the same visual stimulus up to eight trials (22 s) in the future. This stimulus-specific, long-term adaptation persists despite intervening stimuli, and is seen in all six visual cortical areas, but not in visual thalamic areas (LGN and LP), which returned to baseline after one or two trials. This is a remarkable example of a discovery that was not envisioned when the survey was designed, but for which its stimulus set was well suited.

*Information on the case study is taken from a 2023 [article](https://elifesciences.org/articles/85550).*


#### Quality and Diversity of Scholarly Communications

Furthermore, open science improves the state of scientific literature. Scientific journals have traditionally faced the severe issue of publication bias, where **journal articles overwhelmingly feature novel and positive results**, according to a 2018 [study](https://pubmed.ncbi.nlm.nih.gov/30523135/). As a result, published findings in certain disciplines may include exaggerated effects, or even "false positives" (wrongly claiming that an effect exists), making it difficult to evaluate the trustworthiness of published results, according to studies from 2011 and 2016. Open science practices, such as **registered reports, mitigate publication bias and improve the trustworthiness of the scientific literature**. Registered reports are journal publication formats in which articles are peer-reviewed and accepted before data collection is undertaken, eliminating the pressure to distort results, according to a 2022 [study](https://www.nature.com/articles/s41562-021-01193-7). Other open science practices, such as pre-registration, also allow a partial look into projects that for various reasons (such as lack of funding, logistical issues, or shifts in organizational priorities) have not been completed or disseminated, according to a 2023 [study](https://pubmed.ncbi.nlm.nih.gov/34396837/), giving these projects a publicly available output that can help inform others about the current state of research.
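The link between publication bias and exaggerated effects can be illustrated with a small simulation. The Python sketch below is a toy model under stated assumptions, not an analysis from any of the cited studies: the true effect, the standard error, and the rule "publish only significantly positive estimates" are all hypothetical choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_EFFECT = 0.2      # hypothetical true effect size
STANDARD_ERROR = 0.2   # hypothetical standard error of each study
N_STUDIES = 2000

# Each simulated study estimates the true effect with sampling noise.
estimates = [random.gauss(TRUE_EFFECT, STANDARD_ERROR) for _ in range(N_STUDIES)]

# "Publish" only studies whose estimate is significantly positive (z > 1.96).
published = [e for e in estimates if e / STANDARD_ERROR > 1.96]

mean_all = statistics.mean(estimates)
mean_published = statistics.mean(published)
print(f"true effect:            {TRUE_EFFECT:.2f}")
print(f"mean of all studies:    {mean_all:.2f}")
print(f"mean of published only: {mean_published:.2f}")
```

Because only estimates above the significance threshold survive the filter, the average of the "published" studies is necessarily larger than the true effect, while the average over all studies stays close to it.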

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Benefits_To_Science")

### Benefits to Society

Collaboration, innovation, education, technology advancement, and science-based public policy are all improved by the open availability of research products. **Sharing all research products** (e.g., data, code, results) makes the scientific process more transparent, which may help **increase public trust in science**. Also, open science encourages IDEA (Inclusion, Diversity, Equity, Accessibility) and **increases the involvement of citizen-scientists and non-experts** in the research process. Including diverse perspectives from an open community contributes to a more robust and often more accurate scientific outcome.

Scientists study issues that affect every aspect of life. Yet, public interest in science remains low due to a lack of trust and understanding, as well as sociocultural factors. How can scientists expect the public to trust science about complex and often contentious issues, whether it is vaccine development or landing on the moon, if they don’t allow the public to see the process and results? **Building trust in science is essential to a well-informed society.** Open science provides a pathway to do this.

The public who funds government research through taxes should be entitled to its results and data, as long as safety and security are not an issue. Science should be more open to ensure its insights benefit the public who enables it.

Open science introduces more scrutiny into research that helps ensure accuracy and encourages efficiency through open discourse. This approach accelerates the pace of discovery and, subsequently, the dissemination of results to the public and policymakers.

#### Open Science Can Accelerate the Pace of Science

Open science practices accelerate the pace of scientific discovery by involving ideas and labor from the broader community. The rapid response to the [COVID-19 pandemic showed open science in action to accelerate discovery](https://www.nejm.org/doi/full/10.1056/NEJMp2034518).

Researchers uploaded the initial genome sequence of SARS-CoV-2 into an open-access database in January 2020, creating a data-sharing precedent and metadata that would later enable insights about new COVID-19 variants. The NIH developed a dedicated platform for sharing research tools for COVID-19 and encouraged investigators to expedite reporting to ClinicalTrials.gov ahead of requirements. Open-science publishing agreements that support evidence dissemination have complemented these practices and policies. One day after the World Health Organization declared COVID-19 a public health emergency, **more than 50 academic publishers issued a joint statement committing to open-access policies for COVID-19 research**. Support for preprint servers has promoted awareness of research successes and failures, and **journals have helped accelerate the distribution of actionable information**, including by means of dedicated COVID-19 web pages, endorsement of preprints, and an emphasis on sharing data with public health authorities.

#### Open Science is Efficient Science

Open science extends the benefits it provides to researchers to the communities that scientists hope to serve. Data from one observation or science experiment can have unanticipated uses. In Section 1, we discussed an example in which radar data intended for tracking the effect of climate change was also used to track bird migration.

Through open science practices, research waste can be avoided, such as unintentional and costly repetition of previous studies, according to a 2020 European Commission [report](https://op.europa.eu/en/publication-detail/-/publication/6bc538ad-344f-11eb-b27b-01aa75ed71a1). In the human sciences, this also reduces participant fatigue in the long term. By maximizing what is learned from publicly available data, one does not need to test repeatedly, especially on already vulnerable communities. By “giving away” science, individuals, communities, and organizations can more easily adopt research results to inform interventions for their own needs without the knowledge being gatekept by the original researchers and organizations involved. In this way, open science can strengthen the social and economic impacts of scientific results.

#### Open Science Attracts a Diverse Set of Participants

The open sharing of scientific products and processes makes science accessible to everyone. This allows full participation from everyone, and also maximizes the number of people who can benefit from the work.

The best ways to include a diverse group of open science practitioners and stakeholders are to remove existing barriers and design for inclusion. Beyond this, it is important to learn how to communicate effectively with diverse collaborators and people at different skill levels, career levels, backgrounds, and areas of expertise. The ability to build diverse teams is a skill that everyone can learn. For example, NASA has its own [commitment](https://www.nasa.gov/odeo/diversity-and-inclusion/) to diversity and inclusion.

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/image265.jpg?raw=true" alt = "Diversity of participants in scientific collaboration."/>

*Image credit: Andy Brunning/Compound Interest. CC BY-NC-ND 4.0*

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Benefits_To_Society")

### Key Takeaways

The following are the key takeaways from this section:

- Citing the work of other scientists whose work you build upon or reuse supports the community-minded open science practice of using, making, and sharing.
- Doing science openly can boost the visibility of research and lead to more meaningful collaborations.
- Science quality and efficiency are improved when open science best practices are followed.
- Open science helps society by allowing more people to participate in science, which increases the accuracy and impact of results.

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Key_Takeaways_Section_2")

---

## Section 3: How to Do Open Science

The ability to discern when and how to share information in an appropriate manner is an essential skill of open science. Practitioners of open science must balance the pursuit of maximum openness with respecting diverse cultures, maintaining security and privacy, and following institutional policies and practices.

This section introduces important security and privacy considerations for scientists when sharing information. Next, the section discusses how sharing information may impact different communities. Following this, the section explains the topic of intellectual property, how it can be protected, and the different types of licenses available to facilitate sharing while ensuring the owner of the information receives credit for their work. Lastly, this section covers the effect of rules and regulations set by an organization, grant, or publisher on a scientist's options to make their research open access.

### Maintaining Security and Protecting Privacy

Previous sections have showcased a broad range of open science success stories, but we recognize that there are **still plenty of valid concerns and unexplored challenges to implementing open science**. Open science demands the valuable but complex practices of **respecting diverse cultures, maintaining security, and protecting privacy**. This lesson presents a strategic approach to making decisions about doing open science in common scenarios. For those scenarios that we cannot foresee, this lesson offers mitigation strategies to help overcome unique challenges with mindful preparation and community support.

#### Protecting Military Secrets and National Interests

When the release of data or research can lead to national security concerns, there are added restrictions around sharing this information. In the U.S., sharing of this type of information often falls under the export controls of the **International Traffic in Arms Regulations (ITAR)** and the **Export Administration Regulations (EAR)**. Sharing ITAR/EAR-regulated data, equipment, resources, or research without clearance to do so can put the country's national security at risk and may bring about both severe criminal and administrative penalties.

#### Human Patient Privacy

NASA has collected human spaceflight biomedical data since the start of the Apollo program, but the only human data in the Life Sciences Data Archive are from astronauts who signed releases for their data to be public.

In the U.S., **health data** is protected under the Health Insurance Portability and Accountability Act of 1996 (US-HIPAA) and **is not allowed to be shared without express written consent from the patient**. As such, health information about astronauts is something NASA protects carefully, working to balance the publicity of the job with regulations and best practices for medical privacy while also enabling peer-reviewed biomedical research.

See this example and more at NASA's [Open Science Data Repository](https://osdr.nasa.gov/bio/repo/data/studies/OSD-530/).

#### Respecting Diverse Cultures

Open Science advocates for making research widely available, while also recognizing that there are many reasons why **some information should not be released**, and that these decisions need to involve the people who provided input and/or **could be harmed by the consequences of release**.

#### Indigenous, Cultural, and Conservation Concerns

When considering the impacts of data sharing, it is important to **recognize if those affected are equally represented in the discussion**. For example, **historically excluded communities, the environment, and wildlife** are too often not considered when deciding to make research open access.

[For example](https://www.nature.com/articles/s41576-019-0161-z), while genomic research often relies on individual-based consent, it is often used to make decisions that impact Indigenous communities without their consent.

Another [example](https://www.theguardian.com/us-news/2023/feb/23/lidar-technology-archeology-radical-thinking) of how data can inadvertently impact vulnerable communities is the use of LiDAR by archaeologists to study remote areas. This type of data has the potential to reveal vulnerable Indigenous sites that are not yet protected.

#### CARE Principles

The [CARE Principles of Indigenous Data Sovereignty](https://www.gida-global.org/care) are people- and purpose-oriented, and were originally set up to use data in a way that advances data governance and self-determination among Indigenous Peoples. CARE principles can be applied by **involving communities or local stakeholders** and should be covered at the start of a research project.

#### Environmental Justice

When sharing your results, are you sharing them with the groups that are most impacted in ways that are accessible to them? When studying the impact or effect on a specific community, it is important to include that community in the design of your work and **ensure that the results of the work are accessible (both freely available and understandable) to the communities involved**.

Environmental justice is the fair treatment and meaningful involvement of all people, regardless of race, color, national origin, or income, with respect to the development, implementation, and enforcement of environmental laws, regulations, and policies. Read more about how [NASA Earth data is being made more accessible to those communities most affected](https://www.earthdata.nasa.gov/learn/backgrounders/environmental-justice). 

#### Protecting Endangered Species

Humans aren’t the only group that can be negatively impacted by data sharing. Rare and endangered species can also be impacted. For [example](https://www.rspb.org.uk/birds-and-wildlife/crane), the **sharing of breeding sites for declining wildlife populations can further exacerbate the population decline**. For this reason, rare animals may have their breeding sites kept secret.
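One commonly used way to share observations without revealing exact locations is to reduce the precision of coordinates before release. The Python sketch below is only an illustration under stated assumptions: the coordinates are made up, the helper `generalize` is hypothetical, and rounding to one decimal degree (roughly 11 km in latitude) is an arbitrary choice. Whether any level of generalization is sufficient for a given species is a decision to make with conservation experts.

```python
def generalize(latitude, longitude, decimals=1):
    """Round coordinates so the exact site cannot be read from shared data."""
    return round(latitude, decimals), round(longitude, decimals)


# Hypothetical breeding-site observations as (latitude, longitude) pairs.
exact_sites = [(52.6789, 1.2345), (52.6412, 1.3071)]

# Share only the generalized coordinates, never the exact ones.
shared_sites = [generalize(lat, lon) for lat, lon in exact_sites]
print(shared_sites)
```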

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Maintaining_Security_And_Protecting_Privacy")

### Intellectual Property

Intellectual property is the **recognition of rights associated with the content created by human intellect**. There are several different types of intellectual property, and how they are recognized varies by country, type, and timescale.

It's important to understand **who has the rights to the content you create**. It can depend on a number of different factors. Work that you create may belong to your **employer**, may be in the **public domain**, may depend on the **license of underlying work**, may belong to the **publisher of your work**, or may be your **own intellectual property**. Ownership may affect how your work can be shared.

#### Most Common Types of Intellectual Property Protection

> **Copyright**

**A copyright protects original works of authorship**. This could be artistic or literary works and also applies to software. In general, and if applicable, **copyright is automatically applied at the moment of creation**, with no further registration needed.

Most open licenses depend on copyright. The person(s) who own the copyright have the right to apply a license to the work.

Example: An image in a scientific journal or something from the web. Generally speaking, using copyrighted images for teaching and education is considered fair use. However, if that includes posting images to a website, that could be considered a publication and, therefore, copyright infringement.

> **Trademark**

A trademark can be applied to any content, including words, phrases, symbols, designs, or a combination of these things that identifies your product. Trademarks, in general, are not relevant for scientific purposes.

> **Patents**

**A patent is an exclusive right granted for an invention**, which is a product or a process that provides, in general, a new way of doing something, or offers a new technical solution to a problem. **Patents are another way to make your work open while protecting your intellectual property.**

Many organizations have groups that will support the development and commercialization of inventions. [NASA's Tech Transfer](https://technology.nasa.gov/) office is an example of one of these, making many of NASA's inventions available for licensing as part of the [NASA Patent Portfolio](https://technology.nasa.gov/patents).

> **Public Domain**

**In some cases, intellectual property is not protected at all.** Public domain is when a creative work has no intellectual property rights associated with it. Some types of intellectual property expire after a certain time scale. Some types of work, such as those created by civil servants in the United States, are not covered by copyright and can appear immediately in the public domain. For others, the creator donates the work to the public domain, or intellectual property rights are not applicable.

#### Why Should You Care About Intellectual Property Policies?

Why should I, as a scientist, care about this? Well, consider what happens to the ownership of your research if you move institutions:

- Can you take your paper drafts, presentations, and copies of publications with you?
- Can you take your data?
- Can you take your software?

Worrying about intellectual property and copyright can seem like an unnecessary detail early on. However, anticipating changes to your situation by ensuring permanent ownership of your work in the planning phase of your research can help you **avoid legal and institutional issues later** on.

If you submit your manuscript to a publisher that requires that they own the copyright of the work, will you be able to access that paper when you change jobs and no longer have a subscription to that work? Are you able to meet the mandates of your funding agency to openly share your work? Can you reuse the figures that you made in derivative works? Will others be able to access your work? While these may seem like questions you shouldn't have to worry about, they can become very difficult to deal with after the fact.

#### Licensing

**Licensing is a way to allow others to reuse your work legally**. It lets you specify under what conditions, if any, others can use, build upon, or distribute your work. It is also a method to ensure that your work is appropriately credited. **It is generally illegal and may be a form of academic misconduct to reuse content without a license, even if the content can be found on the internet**. This law protects content creators, just as it protects your work from being used by others without clear permission. Thankfully, it’s easy to allow others to reuse your work. By applying a license to your work, you make clear what others can do with the things you're sharing and also establish the conditions under which you're providing them (such as citing you).

**If you don't license your work, others can't/shouldn't re-use it – even if you want them to.** Licenses can be applied to data, code, reports, publications, and almost any other "creative" output. There are several different types of licenses, as well as the case where no license needs to apply:

> **Permissive Licenses**

Permissive Licenses allow users a wide range of rights, including the **ability to use, modify, and distribute the work with no restrictions or very few**. Examples of permissive licenses would be open source software licenses such as Apache 2.0 or MIT licenses or the Creative Commons licenses such as <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution (CC-BY)</a>.

> **Protective Licenses**

Protective Licenses are a legal technique of **granting certain freedoms over copies of copyrighted works while including some limitations**. This may include copyleft licenses, commercial licenses, or other restrictions.

> **Public Domain**

Public Domain is not a license, but it is an indication that there are **no reuse restrictions on the work.** <a href="https://creativecommons.org/publicdomain/zero/1.0/">Creative Commons Zero (CC0)</a> is a worldwide public domain dedication that indicates that the material is free to use without any restrictions.

More details about licensing for each of these types of products can be found in later days, including different types of licenses, when to apply a license, and tools for applying licenses. Creative Commons and the Open Source Initiative are two resources with more information on open licenses.

#### Case Study: Neuromatch Academy Licensing

**All Academy material (tutorial code, tutorial videos, lecture PowerPoint slides, etc.) is published under a CC BY license.** This means the content creators are giving others permission to reuse the content. This also means that the Academy can’t publish content that it doesn’t own or that isn’t already under a CC BY license. It’s a decision Neuromatch, Inc. took to pursue the aim of facilitating inclusive, collaborative, and global participation in the computational sciences through education by making the learning content accessible and reproducible.

Creative Commons (CC) licenses are among several kinds of public copyright licenses that enable the free distribution of content. A CC license is used when an author wants to give other people the right to share, use, and build upon a work that the author has created.
There are lots of different CC licenses. CC BY is the most generous as it “lets others distribute, remix, adapt, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of licenses offered. Recommended for maximum dissemination and use of licensed materials.”

To check the copyright license of the figures that you would like to use, you can visit the ‘Rights and permissions’ section in the header of the article from which they originated. Below there is a list of distinct options and corresponding icons:

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/nmalicense.png?raw=true" alt = "Different license options."/>

For example, it may look like the following in the context of a particular material:

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/nmacopyright.png?raw=true" alt = "Copyright example."/>

Academy can only use a figure or other content if it’s under a CC BY license. **Because Neuromatch, Inc. is an educational nonprofit and the primary use of images is for education and not for commercial use, it can use content under any CC BY license.** 

**Tip from Neuromatch team:** A rule of thumb is that any image that you yourself did not produce is likely to be subject to someone else’s copyright. Figures from most journals are not under a CC BY license. Figures from many places on the internet (except Wikipedia) are usually subject to copyright. If the content you are using is not under CC BY, the best course of action is to use a different figure: a CC BY version from bioRxiv, a similar figure published in an open-access journal, or a new figure that you create yourself.

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Intellectual_Property")

### Policies and Practices around Open Science

#### Preparing to Use and Make Controlled Research

It is important to **plan for the release of your data and results from the very beginning** of your research project. Investigate and obtain all permits, approvals, and/or certifications needed to ensure you can share your research products.

**Remember:** Reputable journals and repositories will reject submissions if compliance can’t be documented!

> **Material-sharing and commercial product agreements**

- Can be permissive or restrictive.
- Many versions are available.

> **Human or animal subject institutional review boards**

- Check experiment-specific requirements early.
- Be sure to comply with all aspects of ongoing review.

> **Collecting permits**

- Don't assume collection is allowed just because a sampling location seems unmanaged.
- Engage and consult with local communities to ensure their concerns are addressed.

#### Sharing Controlled Research

As we've previously shown, different kinds of intellectual property are released using different formal structures. It is important to understand these structures and to check with specialist communities when preparing your research plan. Methods for sharing results may follow different standards of practice or may require a special data format for distribution or submission to common repositories.

> **Creative Commons vs. open source vs. public domain licenses**

- Can be permissive or restrictive.
- Many versions are available.

> **Repositories**

- General and discipline-specific options.
- Check submission requirements early.
- Often have user communities willing to help.

> **Guiding elements for selecting between options**

- Choose 'supported' versions with active and friendly communities.
- Take precautions to reduce security risk.

What are the rules for science? Before sharing, check you have the right to do so:

1. What does your supervisor or Principal Investigator say?
2. What does your grant/contract say?
3. What does your organization say?
4. What does your funding agency say?
5. If you are planning to publish, what does the publisher say?

Remember, sometimes what they say may conflict, for example:

- If your grant / funder says outputs should be open, usually your institute will permit you to share items even if they are normally more restrictive.
- Different types of outputs may have different types of restrictions. (e.g. software or hardware might have one expectation, whilst data might have others).

Universities and other institutions may have OSPOs (**Open Source Program Offices**) or commercialization offices. Most institutes will have **intellectual property counsel** to help answer questions. Librarians are another good resource to consult when looking for advice on sharing.

#### Early is Better

It is important to think about what policies may affect your research outputs as early as possible so that when you want to share information, you have either already obtained approvals or know where to go to get approvals to share. This ensures that you don’t inadvertently share (or fail to share) something that could affect your career, negatively impact others, or pose legal issues.

**Remember:** You can't unshare something that is already shared! Equally, if your research requires ethical approval or consent to share, this may be harder to gain after you’ve done your study.

#### Reusing Science Ethically - Give Credit!

It’s always important to properly source any content you use and remember to only share properly licensed content. **Even if a license does not require attribution, providing credit helps increase reproducibility by providing the provenance of your work**. This is the norm in scientific communities.

Remember when reusing science:

- **Open science is a partnership**, and giving credit is critical to make it work.
- **Consider citing all resources used**: datasets, software, infrastructure, etc.
- Hopefully, others will reciprocate when reusing your work. (Scientific ethics dictate they should.)
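
As a minimal sketch of what giving credit can look like in practice, the helper below assembles a one-line credit from the four pieces commonly recommended for Creative Commons material: title, author, source, and license. The function name `format_attribution` and all example values are hypothetical and exist only for illustration; adapt the wording to whatever citation style your community expects.

```python
def format_attribution(title, author, source_url, license_name, license_url):
    """Return a one-line credit in Title / Author / Source / License order."""
    return (
        f'"{title}" by {author} ({source_url}) '
        f"is licensed under {license_name} ({license_url})."
    )


# Hypothetical example values, for illustration only.
credit = format_attribution(
    title="Example Figure",
    author="A. Researcher",
    source_url="https://example.org/figure",
    license_name="CC BY 4.0",
    license_url="https://creativecommons.org/licenses/by/4.0/",
)
print(credit)
```

Keeping the credit next to the reused item (in a caption, a README, or a notebook cell) records the provenance even when the license itself does not require attribution.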

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Policies_And_Practices_Around_Open_Science")

### Activity 3: Not all Science Can, or Should, be Open All the Time

In this activity, reflect on your answers to the questions and discuss them in a group.

- What are some reasons you would NOT want your research to be open?
- How would you balance openness with privacy/security/control?

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Not_All_Science_Can_Or_Should_Be_Open_All_The_Time")

### Key Takeaways 

In this section, you learned:

- Situations when it may be inappropriate or harmful to share your data or research. These include maintaining security, protecting privacy, and respecting diverse communities.
- What intellectual property is, who owns it, and how it is protected through licenses.
- Various organizations within science (e.g. universities, publications, funding agencies, etc.) may have their own individual sharing policies that are best to consider at the beginning of a research project to avoid any potential pitfalls along the way.

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Key_Takeaways_Section_3")

---
## Section 4: When Not to Be Open

In this section, you will consider potential barriers to adopting open science practices. Barriers can take the form of personal fears, misaligned incentives, social challenges, or institutional and infrastructure barriers. We begin with an exercise to identify your own concerns or fears about adopting open science. This leads to a discussion about common barriers and mitigation strategies.

### Activity 4: Self-Reflection on Open Science Concerns

In this activity, reflect on the given topic and discuss your thoughts in a group.

Take a moment to think about what fears or concerns you have about adopting open science. These could be concerns you have experienced in your work or fears you have about being more open moving forward. There are no wrong answers here – this is a time for you to reflect on what might be holding you back from doing open science.

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Self_Reflection_on_Open_Science_Concerns")

### Some Fears Around Adopting Open Science Practices

Now that you've reflected on some of your concerns or fears around open science, below we have listed a few common fears of doing open science and some potential mitigation strategies. Even if you personally don’t have this fear, it can be useful to think about the different concerns that others may have to better understand and even help others address them.

> **Mistakes:** What if my work is wrong or inelegant?

**It can be intimidating to share your research materials publicly** because someone might find a mistake or inefficiency. But isn't it better for science if we can quickly find and fix mistakes or improve quality? Peer review is a core pillar of the scientific method and is a mechanism for others to help find and correct mistakes and make improvements. To make this work, we will need to be more open to finding and fixing mistakes or inefficiencies. It's true that in many science communities, a mistake is considered a failure, or a certain style may be considered lackluster. However, **open science policies aim to change the perception of mistakes from that of failure to a step in the discovery process** that can be aided by open community feedback.

> **Scooping:** What if someone re-uses my work and gets the credit?

Yes, this can happen. **Depositing your work early and making it citable are ways to establish your work.** This serves as evidence of when you started working on it and makes it easier for others to cite you. Details of how to do this are provided in the following days. In many fields, if it is clear that someone is actively working on a problem, the decision to scoop that work may bring a short-term gain but a long-term loss. In science, reputations are very important, and being collaborative generally leads to greater career success. Read more about scooping <a href="https://datascience.codata.org/articles/10.5334/dsj-2017-029">here</a>.

> **Misinterpretation of my work.**

This can happen regardless of the form or openness of your work - many publications have ended up being misinterpreted. **Openness does help to provide further context of the work**. Documentation of your research plan and software management practices allow others to understand your work fully, and thus help reduce the risk that others will misinterpret your work. For example, if you share code, you can **include a description of what the code does, along with brief usage instructions and examples**. In Day 4, we will discuss proper data and code documentation that can help reduce misinterpretation.
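
A minimal sketch of such documentation is shown below, assuming a small, hypothetical helper named `celsius_to_kelvin`. The docstring states what the code does, how to call it, and what a caller should expect back, so a reader does not have to guess at the intent.

```python
def celsius_to_kelvin(temp_c):
    """Convert a temperature from degrees Celsius to kelvin.

    Usage:
        Pass a single number; a float is returned.

    Example:
        >>> celsius_to_kelvin(0.0)
        273.15
        >>> celsius_to_kelvin(-273.15)
        0.0

    Raises:
        ValueError: if the input is below absolute zero (-273.15 degrees Celsius).
    """
    if temp_c < -273.15:
        raise ValueError("temperature is below absolute zero")
    return temp_c + 273.15


print(celsius_to_kelvin(0.0))
```

If this function were saved in a file, the examples in the docstring could also be checked automatically with Python's built-in `doctest` module (for example, `python -m doctest your_file.py`), which helps keep the documentation and the code in agreement.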

> **My work will be used, but not cited.**

Science ethics dictates that you should be cited if your work is used. Part of open science is valuing all steps of the scientific workflow, and encouraging researchers to cite code, data, or other non-published articles. **Make it easy for others to cite you by adding a digital object identifier (DOI - discussed later in the course) to your research product.** Remember to cite others' materials, so you're not adding to the problem.
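
One way to make a DOI easy to use is to ship a ready-made citation alongside your product. The sketch below is illustrative only: the helper names `doi_to_url` and `format_citation`, the citation layout, and every example value (including the DOI) are hypothetical. The one fixed convention it relies on is that a DOI can be turned into a link by prefixing it with `https://doi.org/`.

```python
def doi_to_url(doi):
    """Turn a bare DOI, or one written with a 'doi:' prefix, into an https://doi.org/ link."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    return f"https://doi.org/{doi}"


def format_citation(authors, year, title, repository, doi):
    """Assemble a simple one-line citation that ends with the DOI link."""
    return f"{authors} ({year}). {title}. {repository}. {doi_to_url(doi)}"


# Hypothetical metadata, for illustration only.
citation = format_citation(
    authors="Researcher, A.",
    year=2024,
    title="Example analysis code",
    repository="Example Repository",
    doi="doi:10.5555/12345678",
)
print(citation)
```

Placing a line like this in a README or at the top of a notebook gives others something they can copy directly into their reference list.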

> **Data is too sensitive to share.**

Following appropriate anonymization or using controlled access can address this concern.
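
As a rough illustration of one preparatory step, the sketch below drops direct identifiers from a set of records and replaces each participant ID with a keyed hash. Everything here is an assumption made for the example: the field names, the `pseudonymize` helper, and the sample data are hypothetical. Note that this is pseudonymization rather than full anonymization; the remaining fields may still allow re-identification, so the requirements of your ethics board and data repository take precedence over any code recipe.

```python
import hashlib
import hmac

# Hypothetical field names that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymize(records, id_field, secret_key):
    """Drop direct identifiers and replace the ID field with a keyed hash token."""
    cleaned = []
    for record in records:
        raw_id = str(record[id_field]).encode("utf-8")
        # A keyed hash (HMAC) gives a stable token that cannot be recomputed without the key.
        token = hmac.new(secret_key, raw_id, hashlib.sha256).hexdigest()[:12]
        kept = {
            key: value
            for key, value in record.items()
            if key not in DIRECT_IDENTIFIERS and key != id_field
        }
        kept["participant"] = token
        cleaned.append(kept)
    return cleaned


# Hypothetical records, for illustration only.
records = [
    {"id": 1, "name": "Ada", "email": "ada@example.org", "score": 0.82},
    {"id": 1, "name": "Ada", "email": "ada@example.org", "score": 0.91},
    {"id": 2, "name": "Bo", "email": "bo@example.org", "score": 0.40},
]
shared = pseudonymize(records, id_field="id", secret_key=b"keep-this-key-private")
print(shared)
```

The key must be stored separately from the shared data; anyone who holds it can recompute the tokens from known IDs.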

> **I don't want to maintain or update my work.**

Sharing what you did allows others to reproduce, replicate, and build upon your work. That doesn’t mean you have to maintain it for the rest of your life, or even at all. If you don’t plan to maintain your code, it is still recommended that you share the code publicly and archive it. By adding appropriate licensing, documentation, and contributing guidelines, you can make it clear how long you plan to keep your materials maintained (if at all). In fact - others might help maintain it for you!

> **My work won't be useful to anyone else.**

You never know how materials might be used. Individuals who contributed to all different types of software projects ended up helping NASA land a rover on Mars!

*Partially drawn from [Malvika Sharan's "Ten Arguments Against Open Science That You Can Win"](https://www.software.ac.uk/blog/2020-12-17-ten-arguments-against-open-science-you-can-win).*

Some of the fears listed above are not unique to open science and can occur in closed scientific systems. For example, scooping and reusing without citation are both examples of scientific misconduct that can happen in closed science scenarios. Open science practices can provide more avenues for recourse, such as making a preprint available or giving your data or code a DOI and license. **Having more of your work shared in citable ways gives you more power to prove when misconduct has occurred.**

Another example of a fear that occurs in both open and closed spaces is the commitment to maintaining your work beyond publication. Maintenance is a consideration regardless of whether your work was shared: you need to decide how long to store your data and code for yourself in order to reproduce your work, should any questions arise even after publication (we cover sharing and archiving data and code in later days, Open Data and Open Code). By sharing your research materials, you may actually increase the longevity and impact of what you’ve done if others find your materials useful and help maintain and build on top of them.

We recognize that this is not an exhaustive list of concerns and fears toward adopting open science. This list is developed to provide guidance and instill confidence in researchers who intend to do their work more openly moving forward.

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Some_Fears_Around_Adopting_Open_Science_Practices")

### Misaligned Incentives

In this section, we discuss barriers that block participation in open science and that stem from misaligned incentive structures. These all relate to scientific incentives for individuals and organizations that are not aligned with open values.

We distinguish concerns and fears, which are associated with changing the culture of how we do science, from the structural barriers that block researchers' abilities to adopt open science practices. We recognize that there is overlap in these categories, but this framing might be useful for understanding what we have control of as individuals and where we need to **encourage more structural changes to our scientific ecosystem**.

Incentives can come in many forms, but most in science involve **proposal funding** and **career advancement**. In both of these cases, metrics are used for measuring scientific success (e.g., publication and citation count, as discussed earlier in this course). These current metrics do not capture the entire impact of activities that scientists spend their time doing. Below, we present a few examples of misaligned incentives. While there aren’t perfect answers for overcoming these yet, agencies like NASA and initiatives like DORA and CoARA are actively working to update the metrics that define what success means in science, and it will take community action to ensure that open and inclusive practices get the merit they deserve.

> **Challenge: Overvaluing Novelty**

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/image330.jpg?raw=true" alt = "Nobel prize."/>

Awards (for example, prizes or funding) are often given to those who make a big, new scientific discovery or who create a new, exciting tool. This practice overlooks the community that wrote code, curated datasets, maintained fundamental existing tools, and many other important steps that enabled these novelties.

**Prizes often disincentivize crediting a team**, since only one person or a small group can be awarded a prize (for example, a Nobel Prize can be shared by at most three people). **This emphasis on novelty and the individual is starting to change, with awards being offered to groups** (e.g., The White House Office of Science & Technology Policy [Open Science Recognition Challenge](https://www.challenge.gov/?challenge=ostp-year-of-open-science-recognition-challenge)) and with the addition of funding solicitations for maintaining tools and infrastructure. However, it will take time for these changes to become the norm.

> **Challenge: It Takes More Time to be Open**

**Doing open science often requires more time and effort** from researchers to start and maintain. For instance, it can take significantly more time to document and clean code to a degree that the public can easily understand and use it. At the moment, the scientific system doesn’t always reward extra effort like this, which can make it difficult for individuals to spend their time on open activities because it takes time away from starting their next paper. After all, published papers are the main currency of the current scientific system.

Updated metrics of success can help to incentivize individuals to do their work openly. The science community is currently in a transition phase where new metrics are being developed, but the old metrics still dominate in many fields and organizations. It’s important for researchers to recognize that they might not be able to achieve complete openness until the system and culture shift.


In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Misaligned_Incentives")

### Social Barriers

Meaningful collaborations across diverse communities can require **additional time and effort to coordinate across groups** and to address conflicts. While interacting with the community can be one of the most fulfilling things about Open Science, it might also be a source of disagreements about the direction of the project or how it should be used. That’s where licenses and codes of conduct come into play. **Clear rules for community and colleague interactions and use of resources provide a framework to make decisions in a fair and agreed-upon manner**. This can all take additional time, especially at the beginning of a research project, but can save time and headaches down the road.

#### Strategies for Communicating Across Differences

These are ways you can encourage openness in your discussions around research. For in-person sessions, it's good to encourage discussion of these strategies:

- Presume that everyone you work with is doing the best they can at the time.
- Attempt collaboration before conflict.
- Listen carefully and actively.
- Encourage other people to listen as much as they speak.
- Practice empathy and humility.
- Ask questions that seek to understand your colleagues’ context.
- Participate in an authentic and active way that supports the health and longevity of your community.
- Exercise consideration and respect in your speech and actions.
- Treat other people's identities and cultures with respect: e.g., make an effort to say people's names correctly and refer to them by their chosen pronouns.
- Be mindful of your surroundings and of your fellow participants, and take action if you notice a dangerous situation or someone in distress.

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Social_Barriers")

### Institutional and Infrastructure Barriers

#### Institutional Barriers: Institutions Often Move Slowly 

Institutional barriers to the researcher or practitioner present an additional challenge to adopting open science practices. Researchers interested in adopting open science practices might lack support from their department or project supervisors. The budget, resources, or time in a project cycle might be insufficient to practice open science. Institutions might not recognize open science practices in recruiting, training, or promoting in the organization. Even if organizations show interest in moving toward open science, they can move slowly when setting up new systems of support.

In these situations, there isn’t always an obvious mitigation strategy. While we encourage individuals to practice open science, there may be aspects that just aren’t feasible at this point in time without spending a lot of extra time and effort, time that may not be recognized or supported by your institution. It’s best to work within the bounds of the system you are in, and while the entire scientific community is in a transition phase to being more open, it may be that it doesn’t make sense to be open in every way until the institutional barriers are lowered. That said, **the more individuals that push for openness, the more it will become part of the scientific mindset, and the more likely our organizations are to recognize and support our efforts**.

#### Tools & Infrastructure

> **Do the right tools and infrastructure exist to support my work?**

There are many tools and resources for making our code, data, and results more open, but the required infrastructure is still being built, and may not be in place yet to support open science in each discipline. This is where community input can be helpful. Perhaps there is a community already working on implementing the infrastructure you need. If not, you can **start discussions at conferences or on open online forums to help organize the creation of the tools and infrastructure you and your community need to effectively do open science**.

> **How can I get around institute-specific infrastructure when trying to collaborate with people outside my organization?**

Some of the infrastructure (like our computing platforms) is institute-specific, which can be a barrier to collaboration outside of the organization. However, by planning for open collaboration from the start, you can minimize these barriers. For example, you can use freely available tools like GitHub and Google Docs for communication and coordination, even if the computing facilities are institute-specific.

> **Open Science is Worth the Effort!**

While there are many challenges to the adoption of open science, we believe that its benefits and its ethical imperative to the self and to scientific communities, citizens, and policy-makers outweigh the cost of barriers. In addition, the recognition of barriers and areas for caution provides a first step toward resolving them.

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Institutional_And_Infrastructure_Barriers")

### Key Takeaways

The following are the key takeaways from this section:

- There are valid concerns and fears around making our science more open, but there are often specific open science practices that can help to mitigate these fears.
- The misalignment of incentives creates real-world challenges that act as barriers to adopting open science practices. There are ways that individuals can minimize or work with these barriers, as well as organizations and groups that are actively working to update the incentive structure.
- Working openly and collaboratively has its challenges, but there are some strategies for communicating across differences.
- There are also institutional and infrastructure barriers to adopting open practices, but by using general tools and infrastructure, we can minimize some of these challenges.

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Key_Takeaways_Section_4")

---
## Section 5: Planning for Open Science: From Theory to Practice

This day is nearly over, but there's so much more information available about open science – so our last section is for everyone who wants to learn more. In this section, you review ways to start your journey with open science, including a list of resources that you can use now.

### Planning for Open Science

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/image329.png?raw=true" style="width: 60%; height: auto;" alt = "Questions to ask when planning for open science."/>

It is important to think about, discuss, and plan for desired outcomes and processes when you begin your research. Learn about where the best repositories are for your materials, discuss credit and authorship for each separate open science output, and start using open science tools to organize your work. Reach out to repositories in your discipline and institution (usually the library) for help. Including this information in your plans can also strengthen your funding proposals.

Planning for outputs in advance includes:

- Speaking about it and organizing with your research team;
- Deciding which tools to use;
- Thinking about authorship and credit;
- Engaging with relevant stakeholders and research partners, for example, industry, around open science;
- Identifying repositories for software and data;
- Identifying journals (or other outlets) for publications;
- Highlighting these approaches in your grant and much more.

In reality, there is an **exploratory stage** where sharing one’s product may not be part of the plan. During active research and data exploration, data, code, and ideas may be created and deleted even daily. It may not be efficient to spend time making these fully open (e.g., creating DOIs, documentation) because you are just exploring! Still, one may choose to make their code public through this process (it should be in some version control repository anyway; there is no harm in making it public). Part of this planning is beginning to think about what would be valuable to science and figuring out how you might share it.

**It is important to discuss open science with your research team, lab, group, or partners regularly**. Much of responsible open science may seem to be related to outputs – such as data, software, and publications – but preparing and organizing work for these in advance is critical. It is much more difficult to follow leading practices for these at the end of research, in the 'afterthought' mode. **Open science is both a mindset and culture** that starts when you begin a project.

#### Open Science and Data Management Plans

Federal agencies and funders consider **data management** crucial for open science because it ensures that research data is well-organized, accessible, and preserved. In recent years, many have included a **requirement as part of proposals** or project plans for an **Open Science and Data Management Plan** (OSDMP). The OSDMP includes a description of the resources to be used, the products that will be created, how they will be shared, and who will be responsible. These plans can include the data, software, publications, and project governance.
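
As a rough illustration of the four elements just listed, the sketch below models an OSDMP outline as a small Python data structure and reports which elements are still empty. The class name `OSDMPOutline`, its field names, and the example entries are hypothetical, chosen only for this illustration; they do not correspond to any funder's official template.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class OSDMPOutline:
    """Hypothetical outline of the four OSDMP elements named in the text."""

    resources_used: List[str] = field(default_factory=list)
    products_created: List[str] = field(default_factory=list)
    sharing_plan: List[str] = field(default_factory=list)
    responsible_people: List[str] = field(default_factory=list)

    def missing_elements(self) -> List[str]:
        """Return the names of elements that have no entries yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Illustrative entries only; replace them with your project's details.
draft = OSDMPOutline(
    resources_used=["openly available satellite data"],
    products_created=["processed data set", "analysis code"],
)

# Lists the elements that still need to be written.
print(draft.missing_elements())
```

A checklist like this is only a planning aid. An actual plan should follow the template and requirements of the funder or institution you are applying to.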

Open science and data management plans are essential because they enhance the credibility and reproducibility of research by ensuring that data is well-documented, organized, and preserved over time. Effective OSDMPs can have the following benefits:

> **Transparency**

Transparency not only builds trust in scientific findings but also allows other researchers to validate and build upon them, fostering a culture of openness and cooperation.

> **Effective**

Data management can lead to more efficient and cost-effective research processes. By reducing the time spent searching for and organizing data, researchers can dedicate more time to analysis and interpretation, potentially accelerating the pace of discovery and innovation.

> **Reproducibility**

A key tenet of the scientific method is reproducibility, and a well-developed OSDMP helps ensure that others are able to validate your results.

> **Preservation**

The research produced by federal funding represents a significant investment, and it is important that research is saved for future generations to access and understand.

> **Inclusive**

OSDMPs can include research tools and processes that can significantly improve research outcomes through collaboration and consultation.

You will learn more about OSDMPs in Day 2.

#### An Open Strategy

In today's world, many foundations and **agencies that award research grants** increasingly expect proposals to **include an open science strategy**. By including an open science strategy document in your scientific plan, you ensure accessibility and openness in each step of your workflow. Conclude your comprehensive plan with clearly defined steps to make research outputs easily accessible and openly available. The steps identified in your strategy should be integrated into your everyday scientific processes and practices.

> **Requirements**

Every major research foundation and federal government agency now requires scientists to file a data management plan (DMP) along with their proposed scientific research plan. Some ask for additional details on software/code and publications.

> **Include Entire Data Workflow Details in the Plan**

Describe your management workflow for data and related research. Other elements, such as code or a publication, have their own lifecycles and workflows, which also need to be in the plan.

> **Include Open Terminology and Concepts**

Plans that are successful typically include clear terminology about how information is made findable, accessible, interoperable, and reusable. This can include licenses, repositories, formats, and governance of the project.

> **Preservation**

Research materials are valuable and reusable long after the project's financial support ends. Reuse can extend beyond our own lifetimes. Therefore, researchers must plan steps for preservation and accessibility to ensure work is not lost after a research project ends.

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Planning_For_Open_Science")

### Designing for Openness

#### Open Science Applies to the Entire Workflow

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/image402.jpg?raw=true" alt = "Open Science Workflow Phases."/>

*Open Science Workflow Phases. Source: Opensciency*

Regardless of your science discipline or the methodology that you use, the workflow remains largely the same. It has a **planning phase**, an **implementation phase**, and a **release phase**. Within these phases, there are milestones that vary depending on the workflow you follow. For the purpose of our discussion in this section and the other days in the curriculum, we have adopted the scientific workflow with general milestones described in the [Opensciency](https://opensciency.github.io/sprint-content/) curriculum. The details in your workflow may vary, but the overall concepts are the same. What is relevant here is that open science, once adopted, permeates all phases of the workflow. You prepare for it in the planning phase and then continue to integrate its principles throughout the implementation and release phases.

Products created throughout the scientific process are needed to enable others to reproduce the findings. Researchers who wish to make their results reproducible must make key elements of their study openly available for others to test.

<img src="https://github.com/neuromatch/nasa-open-science/blob/main/tutorials/W1D1_TheEthosOfOpenScience/static/image130.jpg?raw=true" alt = "Open Science Workflow Products."/>

*Open Science Workflow Products. Source: Opensciency*

Continuing through the workflow, this updated diagram now shows the types of scientific products that are created at each milestone. The specialized products that you create may vary or be completely different, but the focus on discovery for the public remains the same. Any type of product you create can be modified to support the principles and concepts of open science.

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Designing_For_Openness")

### Use, Make, Share

Here, we introduce the "Use, Make, Share" framework, which can help you gradually increase your adoption of open science depending on the nature and scope of your project. Throughout the course, we will explore how this framework can be used to make your science more open!

#### What Resources Will You Use?

Many open science resources are already available for you to use, and open science has a long history. For example, the act that created NASA mandated sharing of its discoveries with all of humanity, and NASA has been sharing its data openly on the internet since the 1980s. Today, over 100 petabytes of NASA data are openly available for you to search, download, and use; examples of these services are provided in Day 3. Technology and practices have been developed around code that make it easy to collaborate on building complex solutions, and examples are given in Day 4. A range of services make it easy to share and discover open access publications, and these are discussed in Day 5.

#### What Outputs Will You Make?

Throughout the research process, different products and results will be produced. These can include data sets, samples, code, reports, manuscripts, conference proceedings, blog posts, and videos. Each of these has different considerations for how to make it, including how it can be made in open and collaborative ways.

There are also different ways to run a scientific project. Is your project going to be open from inception or open at publication? There are valid reasons for both approaches, but generally the earlier you are open with data, code, and results, the more opportunities there are to grow collaboration networks and build with others (which is quite fun). Often researchers choose to be open within their project teams during development, exchanging data, code, and results, but then only sharing with the world once they feel they have a result they can trust. While this approach has been the cultural ‘norm’ within many communities, this is changing as groups grow more comfortable with openness earlier in projects and experience valuable contributions from others and build new collaboration networks.

Days 3, 4, and 5 will discuss how to make your data, code, and results open.

#### How Will You Share?

Where you choose to share your research materials and results will have a large influence on their impact: how easy they are for others to find, how long they remain available, and how easy they are to reuse.

Will you share data in a file filled with columns of unlabeled numbers without any units or explanations, or will it be in an open, standard format that follows the [Findable, Accessible, Interoperable, Reusable (FAIR) principles](https://www.go-fair.org/fair-principles/)? Day 3 has more details to help you understand how to share your data and explains ideas like FAIR and best practices in sharing data. It also covers considerations for where to share your data so that it is both accessible and preserved.
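
To make the contrast concrete, here is a minimal sketch of the second option: a CSV file whose header names each column and its unit, accompanied by a small metadata record. The measurements, column names, title, the `to_labeled_csv` helper, and the choice of license are all made up for illustration. This is one simple way to label data, not a complete implementation of the FAIR principles.

```python
import csv
import io
import json

# Hypothetical measurements; the values and column names are made up.
columns = [("time", "s"), ("temperature", "degC")]
rows = [(0, 21.5), (60, 21.9)]


def to_labeled_csv(columns, rows):
    """Return CSV text whose header names each column and its unit."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([f"{name} [{unit}]" for name, unit in columns])
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_labeled_csv(columns, rows)

# A small metadata record to share alongside the data file (illustrative).
metadata = {
    "title": "Example temperature series",
    "license": "CC-BY-4.0",
    "columns": [{"name": name, "unit": unit} for name, unit in columns],
}
metadata_text = json.dumps(metadata, indent=2)

print(csv_text)
print(metadata_text)
```

Plain-text formats such as CSV and JSON are used here because they can be opened without proprietary software. Many disciplines have richer community standards, and a repository in your field can advise on which format to use.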

For software, since it is often updated and changed, many researchers first share it on a version control platform like GitHub or GitLab but then archive a version of it in a repository that has long-term preservation capabilities – more on this in Day 4!

For results, open access publications and preprint servers are common locations to share. Day 5 discusses all these options.

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Use_Make_Share")

### Activity 5: Use, Make, Share

Take a moment to answer the following questions about your current research or research that you would like to do:

- What data, software, or publications do you currently use or would like to use? Are they open or closed?
- What are the tools and processes that you currently use? Is it easy to include others in collaboration?
- How is your work shared or planned to be shared? Can anyone access your results?

Discuss the answers in the group.

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Activity_Use_Make_Share")

### Steps to Continue Your Open Science Journey

Here, we will explore the next steps to open science that everybody can take. The thought that open science can impact your entire scientific workflow may seem overwhelming and unachievable, but this is not the case. You can start slowly and gradually increase your adoption depending on the nature and scope of your project. Here are a few immediate ways that you can start engaging in open science.

#### Where to Go From Here

- Get involved: Become part of an open science community in your sector.
- Start using/sharing the open science tools of your community.
- Learn how to use/archive data in repositories and community tools and resources.
- Remember the concise statement of the ethos of open science: find, collaborate, and share!

#### Identify Your Open Science Communities

Here are the steps you can take to find your own open science community:

- Talk with your colleagues.
- Read your field’s literature.
- Run searches, in general and discipline-specific areas.
- Investigate online communities encouraging open science, such as:
  - The EU's '[Foster Open Science](https://www.fosteropenscience.eu/)' program 
  - [The Turing Way online manual](https://the-turing-way.netlify.app/index.html) 
  - [FORRT](https://forrt.org/)

Join open science communities. There are generic ones as listed here or you can seek out communities that are not only within your domain but also within your geographical area.

- [TOPS GitHub discussion board](https://github.com/nasa/Transform-to-Open-Science/discussions)
- Opensciency online open science [community list](https://opensciency.github.io/sprint-content/open-tools-resources/lesson5-open-science-communities.html#communities-of-practice-list)

#### Explore Open Repositories

There are many repositories that host open data, software, and results. We share many of these resources in the later days, but here are two NASA repositories that allow you to search for existing data collections that might be relevant to your interests.

- [Science Discovery Engine](http://science.data.nasa.gov/search)
- [https://data.nasa.gov/](https://data.nasa.gov/)

#### Four Steps to Open Science that Anyone Can Take

1. Keep seeking best practices for open science, and develop plans to be more open in your science or research.
2. Think about all the different types of reviews you are involved with, and how to improve them with a goal of openness.
3. Ask colleagues about open science activities, and award credit for them in evaluations.
4. Engage with underrepresented communities to ensure science encourages a more equitable, impactful, and positive future.

#### Additional Resources

In addition to the resources listed elsewhere in this training, the community resources below are excellent sources of information about open science.

**Disclaimer:** Please note that we reference several papers throughout the course, and some of them might be behind a paywall. If you would like to get a copy of a paper, please contact the author or search for it in an online preprint archive, such as [bioRxiv.org](http://biorxiv.org/).

- [OpenSciency](https://opensciency.github.io/sprint-content/)
- NASA SMD's [Open-Source Science Guidance for researchers](https://smd-cms.nasa.gov/wp-content/uploads/2023/08/smd-open-source-science-guidance-v2-20230407.pdf)
- [The Turing Way handbook to reproducible, ethical and collaborative data science](https://the-turing-way.netlify.app/index.html)

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Steps_To_Continue_Your_Open_Science_Journey")

### Key Takeaways

There is no one way of doing open science, and any steps you take to make your science more open are extremely valuable, especially as we transition to a more open scientific ecosystem in the future. We want people to be able to identify the most important things they "can" openly share, but with the ultimate goal of complete openness.

The following are the key takeaways from this section:

- Preparing and organizing in advance are crucial components for ensuring the effectiveness of open science work.
- Open Science and Data Management Plans (OSDMP) provide a plan for how open science is integrated into a project, including the sharing of data, software, and results.
- Designing for openness is a critical aspect of making sure that open science is integrated into the entire scientific workflow from start to finish. This includes resources that can be used, products that will be made, and how the science will be shared.
- Open science is already happening: teams are conducting their research openly, and many resources can be used to make your research more open.
- There are more opportunities to participate and learn about Open Science – this is just the start!

In [None]:
# @title Submit your feedback
content_review(f"{feedback_prefix}_Key_Takeaways_Section_5")

---
# Summary

After completing this day, you are able to:

- Explain what open science is, why it's a good thing to do, and list some of the benefits and challenges of open science adoption.
- Describe the practice of open science, including considerations when writing a management plan and the tasks in the "Use, Make, Share" framework.
- Evaluate available options when determining whether research products should or should not be open.
- List ways to connect with others who are part of the open science community.