<a href="https://colab.research.google.com/github/zia207/R_Beginner/blob/main/Notebook/01_04_04_data_visualization_plotly.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

![alt text](http://drive.google.com/uc?export=view&id=1bLQ3nhDbZrCCqy_WCxxckOne2lgVvn3l)

# Interactive Data Visualization with `plotly`


The `plotly` package in R provides a powerful framework for creating interactive, web-based visualizations. It builds on the JavaScript Plotly library, offering dynamic plots like scatter, line, bar, and 3D charts that support zooming, panning, and tooltips. This tutorial introduces `plotly` for beginners, using built-in datasets like `iris` and `mtcars`. We'll cover basic plots, customization, and interactive features, assuming basic R knowledge.




## Installation and Setup

Install the package from CRAN if you haven't already:

```r
install.packages(c("plotly"))
```



## Set Up R in a Python Runtime: Install {rpy2}

{rpy2} is a Python package that provides an interface to the R programming language, allowing Python users to run R code, call R functions, and manipulate R objects directly from Python. It enables seamless integration between Python and R, leveraging R's statistical and graphical capabilities while using Python's flexibility. The package supports passing data between the two languages and is widely used for statistical analysis, data visualization, and machine learning tasks that benefit from R's specialized libraries.

In [3]:
!pip uninstall rpy2 -y
!pip install rpy2==3.5.1
%load_ext rpy2.ipython

Found existing installation: rpy2 3.5.1
Uninstalling rpy2-3.5.1:
  Successfully uninstalled rpy2-3.5.1
Collecting rpy2==3.5.1
  Using cached rpy2-3.5.1-cp312-cp312-linux_x86_64.whl
Installing collected packages: rpy2
Successfully installed rpy2-3.5.1


## Mount Google Drive

Create a folder named "R" in Google Drive so that installed packages persist between sessions. Before installing any R package in the Python runtime, mount Google Drive and follow the on-screen instructions:

In [4]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Check and Install Required R Packages

In [5]:
%%R
packages <- c(
          'tidyverse',
          'plotly'
)

In [None]:
%%R
# Install missing packages
new.packages <- packages[!(packages %in% installed.packages(lib.loc='drive/My Drive/R/')[,"Package"])]
if(length(new.packages)) install.packages(new.packages, lib='drive/My Drive/R/')

# Verify installation
cat("Installed packages:\n")
print(sapply(packages, requireNamespace, quietly = TRUE))

Installed packages:
tidyverse    plotly 
     TRUE      TRUE 


## Load Packages

In [6]:
%%R
# set library path
.libPaths('drive/My Drive/R')
# Load packages with suppressed messages
invisible(lapply(packages, function(pkg) {
  suppressPackageStartupMessages(library(pkg, character.only = TRUE))
}))

In [None]:
%%R
# Check loaded packages
cat("Successfully loaded packages:\n")
print(search()[grepl("package:", search())])

Successfully loaded packages:
 [1] "package:dplyr"     "package:ggplot2"   "package:tools"    
 [4] "package:stats"     "package:graphics"  "package:grDevices"
 [7] "package:utils"     "package:datasets"  "package:methods"  
[10] "package:base"     


## Basic Scatter Plot

Create a scatter plot using the `iris` dataset to visualize Sepal Length vs. Sepal Width.

The `plot_ly()` function is the core of `plotly`, using a formula interface (e.g., `x = ~variable`) to specify data. A pipe operator, either `%>%` (from `magrittr`, re-exported by `plotly`) or the native `|>` (base R 4.1 and later), chains commands for customization.
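As a minimal sketch of the formula interface and of chaining, the three calls below should produce equivalent plots that differ only in title. The object names (`base_plot`, `p_nested`, `p_magrittr`, `p_native`) are illustrative, and the native pipe assumes R 4.1 or later.

```r
library(plotly)

# Formula interface: ~Sepal.Length refers to a column of `iris`
base_plot <- plot_ly(data = iris,
                     x = ~Sepal.Length,
                     y = ~Sepal.Width,
                     type = "scatter",
                     mode = "markers")

# Nested call: layout() takes the plot as its first argument
p_nested <- layout(base_plot, title = "Nested call")

# magrittr pipe: the left-hand side becomes the first argument
p_magrittr <- base_plot %>% layout(title = "magrittr pipe")

# Native pipe: same idea, built into base R
p_native <- base_plot |> layout(title = "Native pipe")
```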


### Scatter Plots with `plot_ly()`

Setting `type = "scatter"` with `mode = "markers"` in `plot_ly()` creates a scatter plot, ideal for two continuous variables.

In [7]:
%%R -w 600 -h 400
p1 <- plot_ly(data = iris,
              x = ~Sepal.Length,
              y = ~Sepal.Width,
              type = "scatter",
              mode = "markers",
              marker = list(size = 10),
              text = ~paste("Species:", Species),
              hoverinfo = "text+x+y") %>%
  layout(title = "Sepal Length vs. Width",
         xaxis = list(title = "Sepal Length"),
         yaxis = list(title = "Sepal Width"))

# Display the plot
p1

## Scatter Plot with Grouping

Add color grouping by species:

In [9]:
%%R -w 600 -h 400
p2 <- plot_ly(data = iris,
              x = ~Sepal.Length,
              y = ~Sepal.Width,
              color = ~Species,  # Color by species
              type = "scatter",
              mode = "markers",
              marker = list(size = 10),
              text = ~paste("Species:", Species),
              hoverinfo = "text+x+y") %>%
  layout(title = "Sepal Length vs. Width by Species",
         xaxis = list(title = "Sepal Length"),
         yaxis = list(title = "Sepal Width"),
         showlegend = TRUE)

p2

- `color = ~Species` assigns colors to each species.
- `showlegend = TRUE` adds a legend.
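The default colors can be overridden with the `colors` argument of `plot_ly()`. The sketch below assumes a hand-picked palette (`species_palette` is a hypothetical name, and the hex codes are arbitrary) with one color per species.

```r
library(plotly)

# Hypothetical palette: one hex color per level of iris$Species
species_palette <- c("#1b9e77", "#d95f02", "#7570b3")

p_palette <- plot_ly(data = iris,
                     x = ~Sepal.Length,
                     y = ~Sepal.Width,
                     color = ~Species,          # grouping variable
                     colors = species_palette,  # colors mapped to the groups
                     type = "scatter",
                     mode = "markers")
```

`colors` also accepts a ColorBrewer palette name such as `"Set2"`, as used in the box plot later in this tutorial.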

## Bar Plot

Create a bar plot of cylinder counts from the `mtcars` dataset:


In [10]:
%%R -w 600 -h 400
mtcars_cyl <- table(mtcars$cyl)
cyl_data <- data.frame(cyl = names(mtcars_cyl), count = as.numeric(mtcars_cyl))

p3 <- plot_ly(data = cyl_data,
              x = ~cyl,
              y = ~count,
              type = "bar",
              marker = list(color = c("#1f77b4", "#ff7f0e", "#2ca02c"))) %>%
  layout(title = "Distribution of Cylinders in mtcars",
         xaxis = list(title = "Number of Cylinders"),
         yaxis = list(title = "Count"))

p3

- `table()` and `data.frame()` prepare the data.
- `marker` sets custom bar colors.
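The same summary table can be built with `dplyr` instead of `table()`. This is a sketch of one possible approach, with `cyl_counts` and `p_cyl` as illustrative names.

```r
library(plotly)
library(dplyr)

# count() returns one row per cylinder value; name = "count" names the tally column
cyl_counts <- mtcars %>%
  count(cyl, name = "count") %>%
  mutate(cyl = factor(cyl))  # treat cylinder numbers as categories

p_cyl <- plot_ly(data = cyl_counts,
                 x = ~cyl,
                 y = ~count,
                 type = "bar")
```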

## Line Plot

Plot a time series of unemployment from the `economics` dataset (available in `ggplot2`):

In [11]:
%%R
library(ggplot2)
data(economics)

p4 <- plot_ly(data = economics,
              x = ~date,
              y = ~unemploy,
              type = "scatter",
              mode = "lines",
              line = list(color = "#d62728", width = 2)) %>%
  layout(title = "US Unemployment Over Time",
         xaxis = list(title = "Year"),
         yaxis = list(title = "Unemployment"))

p4

- `mode = "lines"` creates a continuous line.
- `line` customizes color and width.
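For long time series, a range slider under the x-axis can make zooming easier. The sketch below enables it through the `rangeslider` attribute of `xaxis` in `layout()`; the object name `p_slider` is illustrative.

```r
library(plotly)
data(economics, package = "ggplot2")

p_slider <- plot_ly(data = economics,
                    x = ~date,
                    y = ~unemploy,
                    type = "scatter",
                    mode = "lines") %>%
  layout(title = "US Unemployment with a Range Slider",
         # rangeslider adds a draggable overview strip below the main plot
         xaxis = list(title = "Year", rangeslider = list(visible = TRUE)),
         yaxis = list(title = "Unemployed (thousands)"))
```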


## 3D Scatter Plot

Create a 3D scatter plot with `iris` data:



In [12]:
%%R -w 800 -h 600
p5 <- plot_ly(data = iris,
              x = ~Sepal.Length,
              y = ~Sepal.Width,
              z = ~Petal.Length,
              color = ~Species,
              type = "scatter3d",
              mode = "markers",
              marker = list(size = 5)) %>%
  layout(title = "3D Scatter Plot of Iris Data",
         scene = list(xaxis = list(title = "Sepal Length"),
                      yaxis = list(title = "Sepal Width"),
                      zaxis = list(title = "Petal Length")))

p5

- `type = "scatter3d"` enables 3D plotting.
- `scene` customizes 3D axes.


## Box Plot

In [13]:
%%R
data(quakes)  # Load quakes dataset
# Create depth bins
quakes$depth_bin <- cut(quakes$depth, breaks = seq(0, 700, by = 100),
                        labels = paste0(seq(0, 600, by = 100), "-", seq(100, 700, by = 100), " km"))

# Box plot of magnitude by depth bin
# (stored under its own name so earlier plot objects are not overwritten)
p_box <- plot_ly(data = quakes, x = ~depth_bin, y = ~mag, type = "box",
                 color = ~depth_bin, colors = "Set2") %>%
  layout(title = "Box Plot of Earthquake Magnitude by Depth Bin",
         xaxis = list(title = "Depth Bin (km)"),
         yaxis = list(title = "Magnitude (Richter)"),
         showlegend = FALSE,
         width = 700,  # Width in pixels
         height = 500) # Height in pixels

p_box

## Heatmap

In [14]:
%%R -w 700 -h 700
# 2D histogram: counts of earthquakes per longitude/latitude bin
p_heat <- plot_ly(data = quakes,
                  x = ~long,
                  y = ~lat,
                  type = "histogram2d",
                  nbinsx = 20, nbinsy = 20,
                  colorscale = "Viridis") %>%
  layout(title = "Heatmap of Earthquake Locations",
         xaxis = list(title = "Longitude"),
         yaxis = list(title = "Latitude"),
         width = 700,   # Width in pixels
         height = 500)  # Height in pixels

p_heat

## Contour Plot

```r
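# Contour plot of depth over longitude and latitude
# (stored under its own name so the bar plot in p3 is not overwritten)
p_contour <- plot_ly(data = quakes,
                     x = ~long,
                     y = ~lat,
                     z = ~depth,
                     type = "contour",
                     contours = list(showlabels = TRUE),
                     colorscale = "Hot") %>%
  layout(title = "Contour Plot of Earthquake Depth",
         xaxis = list(title = "Longitude"),
         yaxis = list(title = "Latitude"))

p_contour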
```

## Subplots

Combine multiple plots into a single figure:

In [15]:
%%R -w 600 -h 400
p6 <- subplot(p2, p3, nrows = 2, shareX = FALSE) %>%
  layout(title = "Combined Scatter and Bar Plots",
         showlegend = TRUE)

p6

- `subplot()` stacks plots vertically (`nrows = 2`).
- `shareX = FALSE` allows independent x-axes.
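Plots can also be placed side by side. The following sketch builds two small plots of its own (`scatter_demo` and `hist_demo` are illustrative names) and sets `titleX` and `titleY` so that each panel keeps its axis titles in the combined figure.

```r
library(plotly)

scatter_demo <- plot_ly(data = iris, x = ~Sepal.Length, y = ~Sepal.Width,
                        type = "scatter", mode = "markers", name = "Sepals") %>%
  layout(xaxis = list(title = "Sepal Length"),
         yaxis = list(title = "Sepal Width"))

hist_demo <- plot_ly(data = iris, x = ~Petal.Length,
                     type = "histogram", name = "Petal Length") %>%
  layout(xaxis = list(title = "Petal Length"),
         yaxis = list(title = "Count"))

# nrows = 1 places the panels in a single row;
# margin adds space between them so the axis titles do not collide
side_by_side <- subplot(scatter_demo, hist_demo,
                        nrows = 1,
                        titleX = TRUE, titleY = TRUE,
                        margin = 0.08)
```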

## Adding Annotations

Add text annotations to a scatter plot:




In [16]:
%%R -w 800 -h 700
p7 <- plot_ly(data = iris,
              x = ~Sepal.Length,
              y = ~Sepal.Width,
              type = "scatter",
              mode = "markers") %>%
  layout(title = "Scatter Plot with Annotation",
         xaxis = list(title = "Sepal Length"),
         yaxis = list(title = "Sepal Width"),
         annotations = list(
           list(x = 6, y = 4,
                text = "Note: Zoom or pan to explore!",
                showarrow = TRUE,
                arrowhead = 2,
                ax = 20, ay = -30)))

p7


- `annotations` adds text with an arrow at coordinates (6, 4).
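Annotation coordinates can also be computed from the data rather than typed by hand. As a sketch, the code below uses `add_annotations()` to label the observation with the largest sepal width; `widest` and `p_note` are illustrative names.

```r
library(plotly)

# Row of iris with the largest sepal width
widest <- iris[which.max(iris$Sepal.Width), ]

p_note <- plot_ly(data = iris,
                  x = ~Sepal.Length,
                  y = ~Sepal.Width,
                  type = "scatter",
                  mode = "markers") %>%
  add_annotations(x = widest$Sepal.Length,
                  y = widest$Sepal.Width,
                  text = paste("Widest sepal:", widest$Sepal.Width),
                  showarrow = TRUE,
                  arrowhead = 2,
                  ax = 40, ay = -30)  # arrow tail offset in pixels
```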

## Exporting a Plot

Save an interactive plot as an HTML file:

> htmlwidgets::saveWidget(p2, "iris_scatter.html")
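Depending on the installed `htmlwidgets` version, writing a single self-contained file may require pandoc. The sketch below sidesteps that by setting `selfcontained = FALSE`, which stores the JavaScript and CSS dependencies in a folder next to the HTML file. The output location under `tempdir()` and the names `p_export`, `out_dir`, and `out_file` are assumptions for illustration.

```r
library(plotly)

p_export <- plot_ly(data = iris,
                    x = ~Sepal.Length,
                    y = ~Sepal.Width,
                    color = ~Species,
                    type = "scatter",
                    mode = "markers")

# Hypothetical output folder; replace with a path of your choice
out_dir <- file.path(tempdir(), "plotly_export")
dir.create(out_dir, showWarnings = FALSE)
out_file <- file.path(out_dir, "iris_scatter.html")

# selfcontained = FALSE writes the dependencies to a sibling "_files" folder
htmlwidgets::saveWidget(p_export, out_file, selfcontained = FALSE)

# Confirm that the HTML file was written
file.exists(out_file)
```

Keep the HTML file and its dependency folder together when sharing the plot.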

## Best Practices

- **Data Preparation**: Use data frames. Summarize data with `table()` or `dplyr` for bar plots or aggregations.
- **Interactivity**: Leverage `plotly`’s zoom, pan, and hover features for data exploration.
- **Customization**: Use `layout()` for titles, axes, and annotations; `marker` or `line` for styling points and lines.
- **Performance**: For large datasets, sample or aggregate to improve rendering speed.
- **Resources**: Explore `?plot_ly` or the [Plotly R documentation](https://plotly.com/r/) for advanced features like heatmaps, box plots, or animations.
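The performance advice above can be sketched on synthetic data. Here `big_data` is a made-up data frame used only for illustration, and the sample size of 5000 is an arbitrary choice.

```r
library(plotly)

set.seed(42)
# Synthetic example data: 200,000 random points
big_data <- data.frame(x = rnorm(200000), y = rnorm(200000))

# Option 1: plot a random subset of rows
subset_rows <- big_data[sample(nrow(big_data), 5000), ]
p_sampled <- plot_ly(data = subset_rows, x = ~x, y = ~y,
                     type = "scatter", mode = "markers")

# Option 2: aggregate all rows into 2D bins before drawing
p_binned <- plot_ly(data = big_data, x = ~x, y = ~y,
                    type = "histogram2d")
```

Sampling keeps individual points visible, while binning uses every row but shows only counts per cell.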


## Summary and Conclusions

This tutorial demonstrated interactive visualization in R with `plotly`, using the `iris`, `mtcars`, `economics`, and `quakes` datasets (the last records 1000 seismic events near Fiji). It covered scatter, bar, line, 3D scatter, box, heatmap (`histogram2d`), and contour plots built with `plot_ly()`, along with subplots, annotations, HTML export, and size adjustments via `layout(width, height)`. The `quakes` examples showed earthquake location density, magnitude distributions by depth bin, and depth contours across latitude and longitude. The tutorial closed with best practices for data preparation, interactivity, and performance.

`plotly` excels at creating interactive, web-based plots with zooming, panning, and tooltips, surpassing `lattice` for dynamic visualizations. It’s ideal for exploring multivariate data like `quakes` but requires careful binning or subsampling for large datasets. The formula interface and customization options make it versatile, complementing static tools like `lattice` or `ggplot2`.


## Resources

- [Plotly R Documentation](https://plotly.com/r/) for plot types and options.
- `?plot_ly` for function details.
- [Plotly R Layout Guide](https://plotly.com/r/reference/layout/) for sizing and styling.

