Pandas

pandas is a library that defines data structures and algorithms that are useful for data analysis and data science. Its core data structures are Series and DataFrame, which represent 1D and 2D labelled arrays; the 3D Panel was removed in pandas 0.25. DataFrame is especially useful: it defines methods such as pivot_table and query, and has many facilities for dealing with missing data.
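
As a minimal sketch of the DataFrame features mentioned above, the snippet below fills a missing value, filters rows with query, and reshapes with pivot_table. The column names and values are made up for illustration and are not taken from the notebooks in this directory.

```python
import numpy as np
import pandas as pd

# Hypothetical sensor readings; names and values are invented for this sketch.
df = pd.DataFrame({
    "sensor": ["A", "A", "B", "B"],
    "day": ["mon", "tue", "mon", "tue"],
    "level": [1.0, np.nan, 3.0, 5.0],
})

# Missing data: replace NaN in "level" with the mean of the observed values.
filled = df.fillna({"level": df["level"].mean()})

# query: select rows using a boolean expression on column names.
high = filled.query("level > 2.5")

# pivot_table: one row per sensor, one column per day, mean level in each cell.
table = filled.pivot_table(index="sensor", columns="day",
                           values="level", aggfunc="mean")
print(table)
```

Filling with the mean is only one of several strategies; dropna and interpolate are alternatives, depending on the data.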

For exploratory analysis, pandas also provides easy-to-use plotting methods on Series and DataFrame, built on top of matplotlib.

What is it?

  1. agt_analysis.ipynb: a notebook illustrating the analysis and visualization of water levels as measured by various sensors.
  2. agt_data: three CSV files used in the notebook.
  3. data_generation.ipynb: notebook that generates some simulated gene expression data using numpy and pandas.
  4. pandas_intro.ipynb: illustrates various aspects of using pandas such as importing data, using Series, DataFrame, cleaning and formatting data, dealing with missing data, adding and removing columns, and various algorithms and visualizations.
  5. data: some data sets used in the notebook above.
  6. patients.ipynb: running example used in the Python slides.
  7. patient_data.ipynb: extended version of the running example used in the Python slides.
  8. pipes.ipynb: consolidating data processing using pipes.
  9. screenshots: screenshots made for the slides.
  10. generate_csv_files.py: script to generate CSV files in different formats.
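
The idea of consolidating data processing with pipes (item 8) can be sketched with DataFrame.pipe, which passes a data frame through a chain of functions. This is an illustration under assumptions, not the contents of pipes.ipynb: the helper functions drop_missing and add_fahrenheit and the temperature column are hypothetical.

```python
import pandas as pd


def drop_missing(df):
    """Return a copy of df without rows that contain missing values."""
    return df.dropna()


def add_fahrenheit(df, column):
    """Return a copy of df with a Fahrenheit version of a Celsius column."""
    return df.assign(**{f"{column}_f": df[column] * 9 / 5 + 32})


# Hypothetical measurements in degrees Celsius, one of them missing.
raw = pd.DataFrame({"temperature": [36.5, None, 38.0]})

# Each step receives the result of the previous one; raw is left unchanged.
result = raw.pipe(drop_missing).pipe(add_fahrenheit, column="temperature")
print(result)
```

Because each step is a plain function that takes and returns a DataFrame, steps can be tested on their own and reordered without rewriting the chain.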