[FEATURE] Emissions Option for Matching Result Validation #18

Closed
5 tasks done
tikhomirovd opened this issue Jan 31, 2024 · 0 comments · Fixed by #35
🚀 Feature Proposal: Emissions Option for Matching Result Validation

Motivation

In the field of data analysis, particularly in matching scenarios, understanding the impact of extreme values (outliers) on the overall result is crucial. The current setup in HypEx lacks a direct way to evaluate how the results vary before and after the removal of outliers. The introduction of the "emissions" option aims to fill this gap. This feature will allow analysts to assess the extent to which outliers influence the matching results, ensuring more robust and reliable data analysis.

Feature Description

The "emissions" feature is a proposed new option for the Matching Result Validation process in HypEx. It compares matching results before and after the removal of outliers. The core functionality includes:

  • Calculation of matching results with all data points, including outliers.
  • Recalculation of matching results after removing outliers.
  • Generation of a comparative report or metric that highlights the differences in results due to outliers.

This feature would be particularly useful in scenarios where data integrity and accuracy are paramount, and outliers may significantly skew the results.
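
The three steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not HypEx's API: the function names (`att_nearest_neighbor`, `drop_outliers`, `emissions_check`), the one-feature nearest-neighbour matching, and the use of Tukey's IQR fences as the outlier rule are all hypothetical choices made for the example, since the proposal does not specify them.

```python
from statistics import mean, quantiles


def att_nearest_neighbor(rows):
    """Effect on the treated via 1-nearest-neighbour matching (illustrative only).

    rows: list of (feature, treated, outcome) tuples.
    """
    treated = [r for r in rows if r[1]]
    control = [r for r in rows if not r[1]]
    if not treated or not control:
        raise ValueError("need at least one treated and one control row")
    diffs = []
    for feature, _, outcome in treated:
        # Match each treated unit to the control unit with the closest feature.
        match = min(control, key=lambda c: abs(c[0] - feature))
        diffs.append(outcome - match[2])
    return mean(diffs)


def drop_outliers(rows, k=1.5):
    """Drop rows whose outcome falls outside Tukey's fences (an assumed rule)."""
    q1, _, q3 = quantiles([r[2] for r in rows], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if low <= r[2] <= high]


def emissions_check(rows):
    """Compare the matching estimate with and without outliers."""
    before = att_nearest_neighbor(rows)   # step 1: all data points
    kept = drop_outliers(rows)
    after = att_nearest_neighbor(kept)    # step 2: outliers removed
    return {                              # step 3: comparative report
        "effect_before": before,
        "effect_after": after,
        "difference": after - before,
        "rows_removed": len(rows) - len(kept),
    }


# Toy data: one treated unit has an extreme outcome (95.0).
rows = [
    (1.0, True, 12.0), (2.0, True, 13.0), (3.0, True, 14.0), (4.0, True, 95.0),
    (1.1, False, 10.0), (2.1, False, 11.0), (3.1, False, 12.0), (4.1, False, 13.0),
]
report = emissions_check(rows)
print(report)
```

On this toy data the single extreme outcome moves the estimated effect from 22.0 to 2.0, which is the kind of sensitivity the comparative report is meant to surface.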

Potential Impacts

  • Performance Considerations: The additional matching pass after outlier removal will increase processing time, most noticeably on large datasets.
  • Compatibility Issues: The option should be backward compatible, but its integration with the existing matching algorithms and validation processes needs to be verified.
  • Dependencies: Relies on the existing outlier detection and removal mechanisms within HypEx.

Alternatives

An alternative would be to provide enhanced reporting and visualization tools that let users inspect the impact of outliers manually. However, this would be slower and more labor-intensive than an automated "emissions" check.

Additional Context

This feature is in response to the need for more nuanced data analysis tools within HypEx, especially in situations where outliers can significantly alter the outcome of data matching processes.

Checklist

  • I have clearly described the feature.
  • I have outlined the motivation for the proposal.
  • I have provided a detailed description of the feature.
  • I have discussed potential impacts and alternatives.
  • I have added any additional context or screenshots.
@tikhomirovd tikhomirovd added the enhancement New feature or request label Jan 31, 2024
@tikhomirovd tikhomirovd added this to the v0.1.0 milestone Jan 31, 2024
@tikhomirovd tikhomirovd self-assigned this Jan 31, 2024
@tikhomirovd tikhomirovd linked a pull request Mar 1, 2024 that will close this issue