Example with financial data #195

Merged
merged 70 commits into from
Nov 14, 2023

Conversation

gcattan
Collaborator

@gcattan gcattan commented Oct 18, 2023

This example is based on a patent application that combines Riemannian geometry (RG) with quantum computing to detect fraudulent behavior.

* add dependence to imbalanced_learn
add example with financial data

* print score
remove dead code

* add patent application number

* [pre-commit.ci] auto fixes from pre-commit.com hooks

* Update financial_data.py

* Update financial_data.py

Co-authored-by: fbarroso24 <fbarroso24@gmail.com>

---------

Co-authored-by: Gregoire Cattan <gregoire.cattan@ibm.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: fbarroso24 <fbarroso24@gmail.com>
@gcattan gcattan changed the title from "Slim vector (#32)" to "Example with financial data" Oct 18, 2023
@gcattan gcattan requested a review from qbarthelemy October 18, 2023 13:53
@gcattan gcattan marked this pull request as ready for review October 18, 2023 13:55
Member

@qbarthelemy qbarthelemy left a comment

It's really great to have a real example on another type of data than biosignals!

)

##############################################################################
# Run evaluation
Member

What are the non-quantum state-of-the-art methods for detecting financial fraud?
It would be good to add them in the comparison.

Collaborator Author

Good idea! Maybe a decision tree or a random forest. @fbarroso24 what do you think?
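
As a rough illustration of what such a classical baseline could look like (this is not the code from this PR), here is a minimal sketch with scikit-learn. The synthetic dataset from `make_classification`, the class imbalance, and the hyperparameters are all assumptions standing in for the real financial data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the financial dataset (assumption:
# the real example loads its own data and features).
X, y = make_classification(
    n_samples=600,
    n_features=5,
    n_informative=3,
    n_redundant=0,
    weights=[0.9, 0.1],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Two tree-based baselines to compare against the SVM pipelines.
baselines = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, clf in baselines.items():
    clf.fit(X_train, y_train)
    # Balanced accuracy is less misleading than plain accuracy
    # when one class is rare.
    scores[name] = balanced_accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: {scores[name]:.3f}")
```

Balanced accuracy is used here only because the synthetic classes are imbalanced; the example in the PR may rely on a different metric.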

pipe,
param_grid={
"toepochs__n": [10, 20],
"xdawncovariances__nfilter": [1, 2],
Member

You should test higher values for nfilter.
What is the number of features?

Collaborator Author

Only three in this example. We could add more features, but the simulation time is quite long.

Member

With more features, the classical pipeline would perform much better.
It seems unfair that the long simulation time of the quantum pipeline hampers the performance of the classical one.

Collaborator Author

Makes sense. I can try that and find a compromise for the pipeline afterwards.

score_qsvm = gs.best_estimator_.fit(X_train, y_train).score(X_test, y_test)

# Print the results
print(f"Classical: {score_svm} \nQuantum : {score_qsvm}")
Member

The quantum pipeline gives a binary classification score of 0.5.
Flipping a coin would do the same thing... unless I missed something.
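
One quick way to check whether 0.5 is really chance level is to score trivial predictors on the same labels. The sketch below assumes a balanced binary test set and plain accuracy as the score; the labels are hypothetical and not taken from the example.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical balanced binary test set: 50 non-fraud (0), 50 fraud (1).
y_test = [0] * 50 + [1] * 50

# A classifier that always predicts the same class.
constant_score = accuracy(y_test, [0] * len(y_test))

# A coin flip, averaged over many repetitions to smooth out the noise.
rng = random.Random(0)
n_runs = 2000
coin_score = sum(
    accuracy(y_test, [rng.randint(0, 1) for _ in y_test]) for _ in range(n_runs)
) / n_runs

print(f"constant: {constant_score:.2f}, coin flip: {coin_score:.2f}")
```

On a balanced set the constant predictor scores exactly 0.5, so a score of exactly 0.5 may also mean that the model always predicts a single class. Looking at the confusion matrix would help tell the two cases apart.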

Collaborator Author

No, this is weird.
In the first version, it was 100%... but with the same data.
I need to investigate this.

@gcattan
Collaborator Author

gcattan commented Nov 12, 2023

@qbarthelemy I made another pass on the example. There are mainly two changes:

  • I kept NearMiss for computational reasons but increased the number of non-fraud examples (that was also one of your remarks at the beginning, if I remember correctly).
  • I changed the problem to "predict the type of fraud" rather than "predict whether the transaction is a fraud".

I also gave the halving grid search a try. It is quicker, but may be less accurate than the standard grid search.
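
The trade-off can be sketched by running both searches on the same grid. This is a minimal illustration with scikit-learn, assuming a plain `SVC` and synthetic data in place of the actual quantum pipeline and dataset; the parameter grid is hypothetical.

```python
from sklearn.datasets import make_classification
# HalvingGridSearchCV is still experimental and must be enabled explicitly.
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import GridSearchCV, HalvingGridSearchCV
from sklearn.svm import SVC

X, y = make_classification(
    n_samples=400, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

# Hypothetical grid; the real example tunes pipeline steps instead.
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.1, 1.0]}

# Exhaustive search: every candidate is evaluated on all samples.
full = GridSearchCV(SVC(), param_grid, cv=3).fit(X, y)

# Successive halving: all candidates start on a small subset of samples,
# and only the best third (factor=3) moves on to a larger subset.
halving = HalvingGridSearchCV(
    SVC(), param_grid, cv=3, factor=3, random_state=0
).fit(X, y)

print("grid search   :", full.best_params_, round(full.best_score_, 3))
print("halving search:", halving.best_params_, round(halving.best_score_, 3))
```

Because early rounds use few samples, a good candidate can be eliminated before it sees enough data, which is one reason the halving search may end up less accurate.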

[Attached images: Figure_1, Figure_2, image]

gcattan and others added 10 commits November 14, 2023 10:48
Co-authored-by: Quentin Barthélemy <q.barthelemy@gmail.com>
@gcattan gcattan merged commit 2da3069 into pyRiemann:main Nov 14, 2023
11 checks passed