CorrPy is a Python library for automated correlation analysis and intelligent feature selection. It simplifies identifying key relationships and improving model performance by handling multicollinearity and irrelevant features.

Parth-Srivastava-bithub/corrpy

🧠 CorrPY – Correlation Made Easy


CorrPY is your lightweight buddy for fast, smart correlation analysis.
Forget just numbers — CorrPY tells you what they mean. 📊✨

Built for data scientists who want insights, not just values.


🚀 Install

pip install corrpy

📦 Quickstart

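import pandas as pd
from corrpy import Corrpy

# Any pandas DataFrame works; "data.csv" is a placeholder file name
df = pd.read_csv("data.csv")

corrpy = Corrpy()
corrpy.getTotalCorrRelation(df)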

✅ Analyze correlation across features
✅ Get trends + easy-to-read interpretations
✅ Go deeper with AI explanations (optional)


🔥 Key Features

  • Numerical vs Numerical — Classic correlations + strength.
  • Object vs Numerical — Category impacts, clear trends.
  • Object vs Object — Categorical association (Chi2).
  • Transitive Trap Alerts — Detect hidden indirect links. 🚨
  • AI-Generated Insights — Explain data like a boss 🧠📜
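
To make the "Object vs Object" idea concrete, here is a minimal, dependency-free sketch of a chi-squared statistic for two categorical columns. It is not CorrPY's implementation; the function name `chi_squared` and the toy data are assumptions for illustration, and no continuity correction is applied.

```python
from collections import Counter


def chi_squared(xs, ys):
    """Return (statistic, degrees_of_freedom) for two categorical sequences.

    Illustrative helper only, not part of CorrPY.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    row_totals = Counter(xs)          # how often each category of xs occurs
    col_totals = Counter(ys)          # how often each category of ys occurs
    observed = Counter(zip(xs, ys))   # joint counts; missing pairs count as 0

    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            # Count we would expect if the two columns were independent
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Toy data (made up for this example)
sex = ["m", "m", "m", "m", "f", "f", "f", "f"]
survived = ["no", "no", "no", "yes", "yes", "yes", "yes", "no"]

stat, dof = chi_squared(sex, survived)
print(f"chi2 = {stat:.2f}, dof = {dof}")
```

A larger statistic, relative to the degrees of freedom, points to a stronger association between the two categorical columns.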

Methods

  1. getTotalCorrRelation(df, features = ["Correlation", "Pearson", "Distance"], feature = "Correlation", short = False): Pass a pandas DataFrame to get correlation analysis across all columns, with trends, interpretations, and a score for the measure you select through the feature parameter.
  2. getGroupInf(objColumn, numColumn, df): Compute the correlation between the given object column and the given numeric column.
  3. getAllGroupInf(df): Compute the correlation between all object columns and all numeric columns.
  4. checkTransit(firstFeature, secondFeature, ThirdFeature): Check for transitive correlation between three features.
  5. checkTransitForColumn(column, df): Check for transitive correlation between a column and all other columns.
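
The README does not spell out how the transitive checks work internally. One common way to think about an indirect link is partial correlation: if two features stop correlating once a third is held fixed, the third is probably driving both. The sketch below shows that idea in plain Python; `pearson`, `partial_corr`, and the toy data are assumptions for illustration, not CorrPY's API or algorithm.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (made up): x and y are each z plus independent noise
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 1, 2, 5, 6, 5, 6, 9]
y = [2, 1, 4, 3, 4, 7, 6, 9]

r_xy = pearson(x, y)
r_xy_given_z = partial_corr(x, y, z)
print(f"corr(x, y)     = {r_xy:.2f}")
print(f"corr(x, y | z) = {r_xy_given_z:.2f}")
```

Here x and y look strongly related on their own, but the relationship disappears once z is accounted for. That pattern is the kind of indirect link a transitive check is meant to flag.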

AI-Generated Insights

  1. explainTC(df, feature="Correlation", prompt="null"): Get AI insights for correlation analysis.
  2. explainShift(num1, num2, shiftValue, df, prompt="Explain like a stand-up comedian"): An AI analyst explains the output of shift() like you're in a meeting with your CEO.
  3. explainTransit(num1, num2, df, prompt="Explain like Angry Professor"): Get AI insights for transitive correlation analysis.
  4. explainTransitForcolumn(column, df, prompt="Explain like Oppenheimer"): An AI analyst explains the output of checkTransitForColumn() like you're in a meeting with your CEO.
  5. explainAI(result, prompt="Explain like angry professor"): Get AI insights for any result.
  6. makeReport(self, method="null", df=None, column=None, feature=None, target=None, prompt="Null", size="short", constant=None, first=None, second=None, third=None): Generate a human-like, well-written paragraph suitable for direct pasting into a PowerPoint slide, based on the output of other methods.

🧠 Example Insights

"Age and Fare have a moderate positive correlation.
Pclass has a strong inverse relation with Fare."

✨ Plus visual trends, interpretation tags, and more!


👨‍💻 Author

YellowForest
🔗 GitHub


📄 License

BSD 3-Clause License


⚡ TL;DR

# What CorrPY Gives You
🚀 Quick, meaningful correlation analysis
🤖 AI-driven explanations
🧩 Find hidden patterns
🔥 Detect transitive traps
🎯 Ideal for both beginners and pros

📢 FINAL NOTE:

CorrPY isn't just another EDA tool...
It's your data's best storyteller. 📚🚀


🧹 How to use:

  • README for Quick Start 📑
  • Full GUIDE.md for Deep Dive 📚
