Skip to content

openmymai/DS-and-ML

Repository files navigation

DataScience and Machine Learning

JupyterLab: A Next-Generation Notebook Interface JupyterLab is the latest web-based interactive development environment for notebooks, code, and data. Its flexible interface allows users to configure and arrange workflows in data science, scientific computing, computational journalism, and machine learning. A modular design invites extensions to expand and enrich functionality.

Pandas vs. NumPy

What is Pandas?

Pandas is defined as an open-source library that provides high-performance data manipulation in Python. It is built on top of the NumPy package, which means Numpy is required for operating the Pandas. The name of Pandas is derived from the word Panel Data, which means an Econometrics from Multidimensional data. It is used for data analysis in Python and developed by Wes McKinney in 2008.

Before Pandas, Python was capable for data preparation, but it only provided limited support for data analysis. So, Pandas came into the picture and enhanced the capabilities of data analysis. It can perform five significant steps required for processing and analysis of data irrespective of the origin of the data, i.e., load, manipulate, prepare, model, and analyze.

What is NumPy?

NumPy is mostly written in C language, and it is an extension module of Python. It is defined as a Python package used for performing the various numerical computations and processing of the multidimensional and single-dimensional array elements. The calculations using Numpy arrays are faster than the normal Python array.

The NumPy package is created by the Travis Oliphant in 2005 by adding the functionalities of the ancestor module Numeric into another module Numarray. It is also capable of handling a vast amount of data and convenient with Matrix multiplication and data reshaping.

Matplotlib: Visualization with Python

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible.

  • Create publication quality plots.
  • Make interactive figures that can zoom, pan, update.
  • Customize visual style and layout.
  • Export to many file formats.
  • Embed in JupyterLab and Graphical User Interfaces.
  • Use a rich array of third-party packages built on Matplotlib.

Seaborn

Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

Seaborn helps you explore and understand your data. Its plotting functions operate on dataframes and arrays containing whole datasets and internally perform the necessary semantic mapping and statistical aggregation to produce informative plots. Its dataset-oriented, declarative API lets you focus on what the different elements of your plots mean, rather than on the details of how to draw them.

About

DataScience and Machine Learning

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published