CSV with the dataset from the deernet paper #673

Closed
George3d6 opened this issue Oct 22, 2021 · 15 comments · May be fixed by mindsdb/benchmarks#47 or mindsdb/benchmarks#48
Labels
good first issue Good for newcomers hacktoberfest Contribute to Lightwood and participate in Hacktoberfest. test Adding or modifying some tests or testing-methodology

Comments

@George3d6
Contributor

I loved this paper: https://arxiv.org/pdf/2106.07465.pdf and I'd love to add the dataset they are using to our benchmark.

However, I'm unsure how to use the physics library required to generate the data and would rather just have a simple CSV with the data (csv columns can contain arrays if need be, but I don't think this will be required here).
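For the "csv columns can contain arrays" option, one possible approach (a minimal sketch, not something settled in this thread) is to serialize each array as a JSON string inside a single cell. The column names `sample`, `trace` and `distribution` and the values below are made up for illustration; they are not taken from the paper or from the benchmarks repo.

```python
import csv
import io
import json

# Hypothetical rows: the names and numbers are illustrative assumptions only.
rows = [
    {"sample": "example_1", "trace": [1.0, 0.8, 0.65], "distribution": [0.0, 0.4, 0.6]},
    {"sample": "example_2", "trace": [1.0, 0.9, 0.85], "distribution": [0.1, 0.7, 0.2]},
]

FIELDS = ["sample", "trace", "distribution"]


def write_rows(rows, handle):
    """Write rows to CSV, storing each array as a JSON string in one cell."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow({
            "sample": row["sample"],
            "trace": json.dumps(row["trace"]),
            "distribution": json.dumps(row["distribution"]),
        })


def read_rows(handle):
    """Read the CSV back, decoding the JSON cells into Python lists."""
    return [
        {
            "sample": row["sample"],
            "trace": json.loads(row["trace"]),
            "distribution": json.loads(row["distribution"]),
        }
        for row in csv.DictReader(handle)
    ]


# Round trip through an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_rows(rows, buffer)
buffer.seek(0)
restored = read_rows(buffer)
```

The csv module quotes the JSON cells automatically because they contain commas, so the file stays a valid CSV.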

Feel free to PR this csv into https://github.com/mindsdb/benchmarks

This will count 3 points towards the hacktoberfest deep learning laptop raffle.

@George3d6 George3d6 added good first issue Good for newcomers test Adding or modifying some tests or testing-methodology hacktoberfest Contribute to Lightwood and participate in Hacktoberfest. labels Oct 22, 2021
@Ashutosh619-sudo

Ashutosh619-sudo commented Oct 25, 2021

how are we going to get the CSV though?

@George3d6
Contributor Author

I don't know, that's literally the issue here, I want to figure out where to source this data from and put it in a simple csv format.

@Ashutosh619-sudo

I read the research paper but I didn't find anything regarding the data availability.

@LyndonFan
Contributor

LyndonFan commented Oct 25, 2021

I found the data at their website: http://spindynamics.org/group/?page_id=12

It does require Matlab and some specific toolboxes -- see their Installation page.

I'm not too familiar with what the data needed is, though. Would just the data used to generate the plots be enough? (i.e. "signal strength"(?) over time)

Also, there seem to be multiple datasets mentioned at the end of the paper. I am happy to generate the csv files for all of them (except for the Samples, which don't seem to be stored in a file).

@MichaelLantz
Contributor

MichaelLantz commented Oct 26, 2021

Sorry for not jumping into the conversation sooner. This took me way way way longer than I anticipated. I wanted to contribute once I had good news.

@LyndonFan's findings parallel some of mine. It sounds like not only is Matlab needed, but it also takes a pretty beefy workstation to keep up.

Fortunately I found an alternative (ComparativeDeerAnalyzer) which comes embedded with a free version of Matlab. I've spot checked some of the plotting and it seems fairly close to the cited research paper. I've added some of the distribution data in a pull request for now to get things started.

@MichaelLantz
Contributor

I should mention there's fit data generated from the ComparativeDeerAnalyzer as well. Not sure if that warrants another issue/pull-request but if this output is the direction we're looking for I can keep plugging away.

@MichaelLantz
Contributor

MichaelLantz commented Oct 26, 2021

In addition, I wasn't able to get all of the datasets using ComparativeDeerAnalyzer: I ran into execution issues in approximately 3 of the 13 datasets referenced in the supplementary information I was able to find. For anyone attempting the same thing, this is the message I was getting before the program crashed:

Network set: purely dipolar modulation, arbitrary distance distribution.
Uncertainty estimate 10 out of 55...
Uncertainty estimate 20 out of 55...
Uncertainty estimate 30 out of 55...
Uncertainty estimate 40 out of 55...
Uncertainty estimate 50 out of 55...

Exiting: Maximum number of function evaluations has been exceeded

@MichaelLantz
Contributor

MichaelLantz commented Oct 28, 2021

In my previous comments I had been using the example input files (.DTA files) from DeerAnalysis 2021b, as cited in the research paper, to generate output from ComparativeDeerAnalyzer.

I've now found the other referenced sample input datasets (provided as .dat files instead of .DTA files), which are included in Spinach under examples\data_paper_kuprov.

I'm seeing if I can install an evaluation version of Matlab and use the methods from the research paper, to see how the results compare with ComparativeDeerAnalyzer and potentially DEERLab. According to the docs, Matlab appears to be required to reproduce the output using Spinach.

@George3d6
Contributor Author

I see the PR but I'm unsure what the target is, and ideally the .csvs should be consolidated into a single one.
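If consolidation does make sense for this data, it could look roughly like the sketch below: the files are stacked into one CSV and a column records which file each row came from. The function name `consolidate` and the column name `source_file` are hypothetical, and the sketch assumes each input file has a header row.

```python
import csv
from pathlib import Path


def consolidate(csv_paths, out_path, source_column="source_file"):
    """Stack several CSVs into one, tagging each row with its source file stem.

    The output header is the union of all input headers in first-seen order;
    cells missing from a given input are left empty.
    """
    columns = [source_column]
    tables = []
    for path in map(Path, csv_paths):
        with path.open(newline="") as handle:
            reader = csv.DictReader(handle)
            for name in reader.fieldnames or []:
                if name not in columns:
                    columns.append(name)
            tables.append((path.stem, list(reader)))

    with Path(out_path).open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns, restval="")
        writer.writeheader()
        for stem, rows in tables:
            for row in rows:
                # The tag is applied last so it wins over any same-named input column.
                writer.writerow({**row, source_column: stem})
    return columns
```

Whether stacking is meaningful depends on the files sharing a schema; if each CSV describes a different quantity, a single file may not make sense, which is the question raised above.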

Could you consolidate them, or would the dataset not make sense then? If they can be consolidated, could you also write an info.py file (similar to other datasets) describing the target and an appropriate accuracy function?
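As a rough illustration of what such a file might contain (an assumption-heavy sketch: the variable names `target` and `accuracy_functions`, the column name `distance_distribution`, and the choice of R² are all hypothetical, so check an existing dataset's info.py in the benchmarks repo for the real layout):

```python
# Hypothetical info.py sketch. Field names and the target column are
# assumptions for illustration, not the benchmarks repo's confirmed schema.


def r2_score(y_true, y_pred):
    """Coefficient of determination for two equal-length numeric sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        # Constant target: perfect only if the predictions match exactly.
        return 1.0 if ss_res == 0 else 0.0
    return 1.0 - ss_res / ss_tot


# Assumed name of the column to predict in the consolidated CSV.
target = "distance_distribution"

# Assumed way of listing the metric(s) used to score predictions.
accuracy_functions = [r2_score]
```

R² is only one plausible choice for a numeric target; the right metric depends on what the target column ends up being.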

@MichaelLantz
Contributor

MichaelLantz commented Oct 28, 2021

Based on what I've seen in the research paper, it describes two separate examples with corresponding data sources (DTA files run through the DeerAnalysis Matlab code, and .DAT files from the DEERNet / Spinach code). I think each referenced graph would have its own CSV, unless I'm way off base.

My Matlab installation and prototyping output was a success. The Spinach code related to the examples in data_paper_kuprov has everything necessary to output the graphs by default; it just needed extra code to output the structure to a file.

I've been able to get the output structure of the DEERNet Samples 1 through 6 (described in the research paper) into XML. I've been prototyping / exploring the data in Power Query, and it needs some work to split the results into something easily consumable. Once I have something more usable I'll create those CSVs and circle back to the DeerAnalysis examples to see if the official Matlab implementation gets better results than the ComparativeDeerAnalyzer I used before trying a full Matlab installation.
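For anyone who would rather script the XML-to-CSV step than use Power Query, a minimal Python sketch follows. The XML layout and the tag names (`samples`, `sample`, `point`, `distance`, `probability`) are invented for illustration; the actual structure exported from Matlab will differ and the code would need adjusting to match it.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML: layout, tag names and values are assumptions only.
XML_TEXT = """
<samples>
  <sample name="sample_1">
    <point><distance>2.0</distance><probability>0.1</probability></point>
    <point><distance>2.5</distance><probability>0.9</probability></point>
  </sample>
  <sample name="sample_2">
    <point><distance>3.0</distance><probability>1.0</probability></point>
  </sample>
</samples>
"""

FIELDS = ["sample", "distance", "probability"]


def xml_to_rows(xml_text):
    """Flatten nested sample/point elements into one dict per point."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for sample in root.iter("sample"):
        for point in sample.iter("point"):
            rows.append({
                "sample": sample.get("name"),
                "distance": point.findtext("distance"),
                "probability": point.findtext("probability"),
            })
    return rows


def rows_to_csv(rows):
    """Render the flattened rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(xml_to_rows(XML_TEXT))
```

Keeping the sample name as a column means all six samples could live in one long-format CSV rather than six separate files.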

@MichaelLantz
Contributor

MichaelLantz commented Oct 31, 2021

I have opened two separate pull requests because the two sets of libraries / examples (Spinach and DeerAnalysis) use different methods to generate results.

@George3d6
Contributor Author

seems almost done, cheers :D

@MichaelLantz
Contributor

MichaelLantz commented Nov 2, 2021

To quote the great McConaughey “all right, all right, all right” 🥇. I just hope it's a helpful benchmark and my google-fu helped provide a quality product. Apologies for including an additional .zip in each change-set. I'm aware that this deviates from the other examples.

Considering this was my first venture into the benchmark project, I figured it would be helpful for a reviewer to have all the tools / data (minus Matlab and the physics libraries) to make sense of how I got to the final CSVs. Cheers!
