# Notes on Image Manipulation

## Background

Year: 2016
Study: Published research claimed a molecular compound could treat cancer.
Evidence: Used cell staining images to support findings.

## Issue Discovered

Investigator: Dr. Elisabeth Bik, microbiologist.
Problem: Found "problematic duplications" in the images, suggesting manipulation.

## Importance

Integrity: Accurate images are crucial for research credibility.
Vigilance: Researchers must check for and report inconsistencies.
Consequences: Manipulation can lead to retractions and loss of trust.

## Task

Exercise: Examine the original images to spot duplications or other signs of manipulation.
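
Visual inspection like this can be supported by flagging exactly repeated regions in code. The following is a minimal sketch, not the method used in the investigation described above. It assumes the image is already loaded as a 2D grayscale NumPy array, and the function name `find_duplicate_tiles` and the tile size are hypothetical choices. It only catches pixel-identical, grid-aligned copies; rotated, rescaled, or re-compressed duplicates would need more robust techniques.

```python
from collections import defaultdict

import numpy as np


def find_duplicate_tiles(image, tile=8):
    """Return groups of (row, col) positions whose tiles are pixel-identical.

    `image` is assumed to be a 2D grayscale array; `tile` is the side length
    of the square, non-overlapping tiles that are compared.
    """
    seen = defaultdict(list)
    rows, cols = image.shape
    for r in range(0, rows - tile + 1, tile):
        for c in range(0, cols - tile + 1, tile):
            patch = image[r:r + tile, c:c + tile]
            # Skip flat tiles (e.g. empty background), which repeat trivially.
            if patch.min() == patch.max():
                continue
            seen[patch.tobytes()].append((r, c))
    return [positions for positions in seen.values() if len(positions) > 1]


# Synthetic example: random noise with one tile copied to another location.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
img[16:24, 8:16] = img[0:8, 0:8]  # simulate a duplicated region

duplicates = find_duplicate_tiles(img)
print(duplicates)
```

Each group in the printed list holds the top-left corners of tiles that share identical pixel values, which is where a human reviewer would then look more closely.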

## Conclusion

Ensuring honest and accurate use of images in research is essential for trust and progress in science.

# Notes on Reproducibility

## What is Reproducibility?

Definition: The ability of others to check your work because the data, code, and methods are available.
Purpose: Enables others to repeat your steps and generate the same results or images from the same data.

### Importance

Ethical: Transparency and accountability in research.
Practical: Easier to make edits, facilitates new work based on previous efforts, and aids in version control.

### Ethical Guidelines

Source: American Statistical Association
Principle: Good practice is based on transparent assumptions, reproducible results, and valid interpretations.
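
To make the idea of repeating steps and getting the same results concrete, here is a small sketch for an analysis that involves random sampling. The function name `sample_mean`, the seed value, and the data are hypothetical. The point is only that fixing and recording the seed lets anyone who reruns the code obtain the same number.

```python
import random
import statistics


def sample_mean(values, k=3, seed=42):
    """Return the mean of a random subsample of `values`.

    A dedicated, seeded generator keeps the result repeatable between runs.
    """
    rng = random.Random(seed)
    subsample = rng.sample(values, k)
    return statistics.fmean(subsample)


data = [5.1, 4.9, 4.7, 4.6, 5.0, 5.4]  # illustrative values, not real data

first_run = sample_mean(data)
second_run = sample_mean(data)
print(first_run == second_run)  # True: same data, code, and seed give the same result
```

Without the fixed seed, two runs could draw different subsamples and report different means, which would make the result harder for others to check.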

## Practical Aspects

Flexibility: Easier to modify plots/images later.
Efficiency: Reuse and build on past work.
Version Control: Track and see changes clearly.

## Key Points

Not a Guarantee of Correctness: Reproducibility doesn't mean the result is correct, but it's essential for scientific progress.
Transparency: Critical for the evolution of scientific knowledge.

# How to Make Data Visualizations Reproducible

## Work Programmatically

Use code (e.g., ggplot2 in R, matplotlib in Python) to create figures.
Avoid relying on manual, point-and-click programs like Adobe Illustrator, since edits made by hand cannot be rerun from code.
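
A minimal sketch of what working programmatically can look like with matplotlib follows. The data values, the function name `make_figure`, and the output file name are made up for illustration. Because every choice (data, labels, size, file name) lives in the script, rerunning it regenerates the same figure.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt


def make_figure(path="example_figure.png"):
    """Draw a small line chart and save it to `path`."""
    years = [2014, 2015, 2016, 2017, 2018]  # made-up example data
    counts = [12, 15, 11, 18, 21]

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(years, counts, marker="o")
    ax.set_xlabel("Year")
    ax.set_ylabel("Count")
    ax.set_title("Example figure generated entirely from code")
    ax.set_xticks(years)
    fig.savefig(path, dpi=150)
    plt.close(fig)
    return path


output_path = make_figure()
print(f"Saved figure to {output_path}")
```

Changing a label or the figure size later means editing one line and rerunning the script, rather than redoing manual edits.
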

## Work in Plain Text

Write code in simple, plain-text formats (e.g., .py, .R, .txt).
Avoid word processors like Microsoft Word.

## Comment Your Code

Explain what choices were made and why.
Makes the code easier to understand, maintain, and update.

```python
# Activity

import matplotlib.pyplot as plt
import seaborn as sns

# Load the Iris dataset
df = sns.load_dataset("iris")

# Set the plot style to 'white' for a clean look
sns.set_style("white")

# Create a filled 2D KDE plot of sepal width vs. sepal length using the 'Reds' colormap
sns.kdeplot(data=df, x="sepal_width", y="sepal_length", cmap="Reds", fill=True)

# Display the plot
plt.show()
```



# How to Write a Good Comment

Clear and concise: Explain the purpose without unnecessary details.
Relevant: Focus on why the code is there, not just what it does.
Consistent: Use the same style and approach throughout your code.
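
The difference between a comment that restates the code and one that explains why it is there can be sketched as follows. The function name `to_millimetres`, the units, and the scenario are hypothetical and only serve to contrast the two styles.

```python
def to_millimetres(lengths_cm):
    """Convert a list of lengths from centimetres to millimetres."""
    # Weak comment (restates the code): multiply each value by 10.
    # Stronger comment (explains the choice): the measurements are assumed
    # to be recorded in centimetres, while the other dataset in this
    # hypothetical analysis uses millimetres, so convert before comparing.
    return [round(length * 10, 1) for length in lengths_cm]


converted = to_millimetres([5.1, 4.9, 4.7])
print(converted)  # [51.0, 49.0, 47.0]
```

The stronger comment still makes sense if the code changes slightly, because it records the reasoning rather than the mechanics.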

# Activity: Searching for Objectivity in Data Visualization



## Visualization #1: US Gun Killings in 2018

Data: Number of gun killings in the US in 2018.
Insights: Trends in gun violence, hotspots of activity, and possibly correlations with other factors (e.g., socioeconomic status).

Objective and Neutral?
Objectivity: The visualization is objective if it presents the raw data accurately, without manipulation or selective presentation.
Neutrality: Depends on how the data is displayed. Neutral colors and the absence of alarming graphics help maintain neutrality. A visualization created without bias that simply presents the facts can be considered neutral.


## Visualization #2: Washington Post Active Shooters Graphic

Data: Incidents involving active shooters, possibly over a specific period.
Details: Locations of shootings, number of casualties, shooter profiles, and timelines.
Insights: Frequency and distribution of active shooter incidents, trends over time, and potential patterns.

Objective and Neutral?
Objectivity: The graphic is objective if it uses comprehensive data from reliable sources, without selective reporting.
Neutrality: The design and presentation should avoid sensationalism. Clear, factual representations that do not dramatize the incidents maintain neutrality, and colors, symbols, and emphasis should be chosen carefully to avoid bias.

## Conclusion

Both visualizations provide valuable insights, but each needs careful analysis to confirm that it is objective and neutral. Presenting data accurately and designing visuals thoughtfully protects the integrity and usefulness of the information.