DataNinja is an all-in-one data analysis toolkit for data manipulation, statistical analysis, and visualization. It bundles popular libraries such as pandas, NumPy, and Matplotlib into a single package, making common data tasks easier for data analysts and scientists.
- DataFrame Creation: Create DataFrames with ease.
- Statistics Calculation: Compute basic statistics such as mean, median, variance, and standard deviation.
- Data Profiling: Generate comprehensive data profile reports.
- Missing Data Visualization: Visualize missing data patterns in your DataFrame.
- Missing Value Handling: Handle missing values using different methods.
- Visualization: Create various plots to analyze and visualize your data.
To install the DataNinja package, use pip:
pip install dataninja
- DataFrame Creation: Create a DataFrame using the create_dataframe function:
import dataninja
data = {
'A': [1, 2, None, 4],
'B': [None, 2, 3, 4],
'C': [1, None, None, 4]
}
df = dataninja.create_dataframe(data)
print(df)
- Statistics Calculation: Calculate basic statistics using the calculate_statistics function:
# Assuming df is already created
stats = dataninja.calculate_statistics(df)
print(stats)
- Data Profiling: Generate a data profile report using the generate_data_profile function. The report can be rendered in a Jupyter Notebook:
# Assuming df is already created
report = dataninja.generate_data_profile(df)
report.to_notebook_iframe() # For Jupyter Notebooks
- Missing Data Visualization: Visualize missing data patterns with the plot_missing_data function:
# Assuming df is already created
dataninja.plot_missing_data(df)
- Missing Value Handling: Fill missing values with the fill_missing_values function, selecting the strategy with the method argument:
# Assuming df is already created
df_filled = dataninja.fill_missing_values(df, method='mean')
print(df_filled)
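To make the effect of method='mean' concrete, here is a minimal dependency-free sketch of mean imputation on the same sample data. It assumes that 'mean' replaces each missing entry with the mean of the observed values in its own column; the source does not confirm how fill_missing_values is implemented, and fill_with_column_mean is a hypothetical helper, not part of DataNinja.

```python
from statistics import mean


def fill_with_column_mean(data):
    """Return a copy of `data` with None replaced by each column's mean.

    Hypothetical helper for illustration only; not a DataNinja function.
    """
    filled = {}
    for name, values in data.items():
        observed = [v for v in values if v is not None]
        # If a column has no observed values there is nothing to average,
        # so its entries are left as None.
        col_mean = mean(observed) if observed else None
        filled[name] = [col_mean if v is None else v for v in values]
    return filled


data = {
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, None, 4],
}

filled = fill_with_column_mean(data)
for name, values in filled.items():
    print(name, values)
```

Under this assumption, the missing entry in column B becomes 3 (the mean of 2, 3, and 4), and both missing entries in column C become 2.5.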
Here’s a complete example of using the DataNinja package:
import dataninja
# Create a DataFrame
data = {
'A': [1, 2, None, 4],
'B': [None, 2, 3, 4],
'C': [1, None, None, 4]
}
df = dataninja.create_dataframe(data)
# Calculate statistics
stats = dataninja.calculate_statistics(df)
print("Statistics:\n", stats)
# Generate data profile report
report = dataninja.generate_data_profile(df)
report.to_notebook_iframe() # For Jupyter Notebooks
# Visualize missing data
dataninja.plot_missing_data(df)
# Fill missing values
df_filled = dataninja.fill_missing_values(df, method='mean')
print("Filled DataFrame:\n", df_filled)
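As a rough cross-check for the statistics step, the sketch below computes the mean, median, variance, and standard deviation of each sample column by hand, along with a count of missing entries. It uses only Python's standard statistics module and skips None values. Whether calculate_statistics uses the sample (n - 1) or population variance, and what shape its output takes, is not stated in the source; the sample variance is assumed here because it is the pandas default. describe_column is a hypothetical helper, not part of DataNinja.

```python
from statistics import mean, median, stdev, variance


def describe_column(values):
    """Summarize one column, ignoring missing (None) entries.

    Hypothetical helper for illustration only; not a DataNinja function.
    """
    observed = [v for v in values if v is not None]
    return {
        'missing': len(values) - len(observed),
        'mean': mean(observed),
        'median': median(observed),
        'variance': variance(observed),  # sample variance (divides by n - 1)
        'std': stdev(observed),          # square root of the sample variance
    }


data = {
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, None, 4],
}

summary = {name: describe_column(col) for name, col in data.items()}
for name, stats in summary.items():
    print(name, stats)
```

For column C, which has two observed values (1 and 4), this gives a mean of 2.5 and a sample variance of 4.5.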
To contribute to the DataNinja package:
- Clone the Repository:
git clone https://github.com/ShelbyTO/DataNinja.git
- Navigate to the Project Directory:
cd DataNinja
- Install Dependencies:
pip install -r requirements.txt
- Run Tests:
pytest
- Make Your Changes: Commit your changes and submit a pull request.
This package is licensed under the MIT License. See the LICENSE file for details.
For any issues or questions, please contact:
Author: Nicolas Prieur
Email: pu-zle@live.fr