# **Theory:**

Exploring the top data analytics libraries in Python reveals a selection of powerful tools designed to streamline scientific computing, data manipulation, machine learning, and web scraping tasks. Similarly, the realm of R offers a rich ecosystem of packages tailored for data manipulation, visualization, machine learning, and web application development. Below, we delve into the features and functionalities of the leading libraries in both Python and R.

## **Top 5 Data Analytics Libraries in Python:**


NumPy:

*   NumPy is a fundamental package for scientific computing in Python.
*   It provides an N-dimensional array object together with functions for efficient array operations, including mathematical, logical, and statistical operations.
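
As a minimal sketch of the three kinds of array operations named above, the cell below applies one mathematical, one logical, and one statistical operation to a small array. The values are made up for illustration and the variable names are arbitrary.

```python
import numpy as np

# Illustrative values only; not taken from any dataset.
a = np.array([[1, 2, 3], [4, 5, 6]])

doubled = a * 2             # mathematical: element-wise multiplication
is_even = (a % 2 == 0)      # logical: element-wise test, gives a boolean array
col_mean = a.mean(axis=0)   # statistical: mean of each column

print(doubled)
print(is_even)
print(col_mean)
```

Each operation works on the whole array at once, so no explicit Python loop is needed.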




Pandas:

*   Pandas, another cornerstone in Python's data analysis arsenal, furnishes fast and flexible data structures, including the DataFrame object.
*   Engineered to handle relational or labeled data effortlessly, Pandas simplifies data loading, alignment, and manipulation tasks.
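
The alignment and manipulation behaviour described above can be sketched as follows. The data is invented for illustration, and `bonus`, `adjusted`, `filled`, and `long_sessions` are hypothetical names chosen for this example.

```python
import pandas as pd

# Made-up workout log with one missing value.
df = pd.DataFrame({
    "calories": [420, None, 390],
    "duration": [50, 40, 45],
})

# Alignment: Series are matched on index labels, not on position.
bonus = pd.Series([5, 10], index=[2, 0])
adjusted = df["duration"] + bonus   # label 1 has no partner, so it becomes NaN

# Manipulation: fill the missing calories with the column mean, then filter rows.
filled = df.fillna({"calories": df["calories"].mean()})
long_sessions = filled[filled["duration"] >= 45]

print(adjusted)
print(long_sessions)
```

Filling with the column mean is only one possible strategy; dropping the row or using a domain-specific default may suit other data better.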




Scikit-Learn:

*   Scikit-Learn stands as an open-source machine learning library offering support for both supervised and unsupervised learning techniques.
*    It provides a suite of tools for model fitting, preprocessing, selection, and evaluation, rendering it accessible and adaptable across diverse contexts.
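
A minimal sketch of the preprocessing, fitting, and evaluation steps mentioned above, assuming the Iris dataset bundled with scikit-learn and a logistic regression classifier. Both are choices made for this example, not something the text prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small labeled dataset (supervised learning).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing and model fitting chained in one pipeline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

# Evaluation on the held-out samples.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Test accuracy: {accuracy:.2f}")
```

Putting the scaler inside the pipeline means it is fitted on the training split only, which avoids leaking information from the test split.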




Keras:

*   For deep learning, Keras is a widely used library that simplifies building neural network models through a consistent API and support for multiple backends. Earlier releases ran on TensorFlow and Theano, while Keras 3 runs on TensorFlow, JAX, and PyTorch.
*   Its user-friendly interface and scalability make it an attractive choice for both CPU- and GPU-based computation.




BeautifulSoup:

*   Rounding off the list, BeautifulSoup emerges as a quintessential tool for parsing HTML and XML documents.
*   By creating parse trees and offering intuitive methods for content extraction and navigation, BeautifulSoup simplifies the often intricate process of web scraping.
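
The parse-tree navigation described above can be sketched as below. The HTML snippet is invented for this example, and Python's built-in `html.parser` is assumed as the parser so that no extra dependency is needed. A real scraper would first download the page.

```python
from bs4 import BeautifulSoup

# Hypothetical page content, kept inline so the example needs no network access.
html = """
<html><body>
  <h1>Departments</h1>
  <ul>
    <li><a href="/cs">CS</a></li>
    <li><a href="/it">IT</a></li>
  </ul>
</body></html>
"""

# Build the parse tree.
soup = BeautifulSoup(html, "html.parser")

# Navigation: attribute access returns the first matching tag.
title = soup.h1.get_text()

# Extraction: collect the text and target of every link.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

print(title)
print(links)
```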




## **Top 5 Data Analytics Libraries in R:**

dplyr:

*   In R, dplyr is a powerful package for manipulating and summarizing tabular data held in data frames.
*   By furnishing a suite of simple yet powerful functions, dplyr streamlines common data manipulation tasks, ensuring efficiency and ease of use.




ggplot2:
* Leveraging the grammar of graphics, ggplot2 empowers R users to construct visually compelling graphs by expressing relationships between data attributes and graphical representations in a coherent manner, thus expediting the creation of plots.

lubridate:
* Addressing the limitations of R's native date and time handling capabilities, lubridate offers a comprehensive suite of functions designed to simplify date and time manipulation tasks, enhancing the efficiency and clarity of code.

mlr:
* Serving as a generic and extensible framework for classification, regression, and clustering tasks in R, mlr integrates over 160 basic learners and includes meta-algorithms and model selection techniques to augment the functionality of basic learners.

shiny:
* shiny emerges as a groundbreaking R package for building interactive web applications directly from R code.
* By adopting a reactive programming paradigm and offering a curated set of user interface functions, shiny empowers data scientists to develop sophisticated web apps without delving into the intricacies of web development technologies.








---

## **Python**

In [None]:
import numpy as np
array1D = np.array([2,3,4,5])
print(f"1D Array : {array1D}, Shape : {array1D.shape}")

array2D = np.array([[1,2,3],[4,5,6]])
print(f"2D Array : \n{array2D}, Shape : {array2D.shape}")

1D Array : [2 3 4 5], Shape : (4,)
2D Array : 
[[1 2 3]
 [4 5 6]], Shape : (2, 3)


In [None]:
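import numpy as np
import pandas as pd

# Small dataset with one missing "calories" value
data = {
  "calories": [420, None, 390],
  "duration": [50, 40, 45]
}
df = pd.DataFrame(data)
# Row positions where "calories" is missing; isnull() already returns booleans
missing = np.where(df["calories"].isnull())
print(df)
print(missing)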

   calories  duration
0     420.0        50
1       NaN        40
2     390.0        45
(array([1]),)




---

## **R**

In [None]:
library(dplyr)
library(ggplot2)
df <- data.frame(
    department = c("AIDS", "CS", "IT", "EXTC"),
    year = c(2014, 2016, 2006, 2001)
)
# Sort the rows by year in ascending order
df_sorted <- arrange(df, year)
df_sorted


Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union




department,year
<chr>,<dbl>
EXTC,2001
IT,2006
AIDS,2014
CS,2016




---
# **Conclusion:**
* Identified the top data analytics libraries in Python and R.
* Performed simple experiments with a selection of these libraries in Python and R.
