Adding new simulation engines #21

Open
ltalirz opened this issue Jan 16, 2021 · 3 comments
Labels
help wanted Extra attention is needed

Comments

@ltalirz
Owner

ltalirz commented Jan 16, 2021

This issue tracks information regarding the addition of new simulation engines.

Before suggesting a new engine, please make sure that

  1. it fits the scope of this list
  2. it has had at least one year with 100 citations or more.

Citations are queried on Google Scholar, with typical search terms being the name of a code + the name of a key author (e.g.: VASP Kresse).
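As a rough illustration of such a query, here is a minimal Python sketch that builds a per-year Google Scholar search URL from a code name and a key author. The URL parameters (`q`, `hl`, `as_sdt`, `as_ylo`, `as_yhi`) are copied from the Scholar links quoted further down in this thread; the function name `scholar_query_url` is hypothetical and not taken from this repository, and the sketch only constructs the URL, it does not fetch or parse any results.

```python
from urllib.parse import quote, urlencode

SCHOLAR_BASE = "https://scholar.google.com/scholar"


def scholar_query_url(code_name: str, key_author: str, year: int) -> str:
    """Build a Scholar search URL for '<code> <author>' restricted to one year.

    Hypothetical helper for illustration only.
    """
    params = {
        "q": f"{code_name} {key_author}",
        "hl": "en",
        "as_sdt": "0,5",
        "as_ylo": year,  # lower bound of the year range
        "as_yhi": year,  # upper bound of the year range
    }
    # quote_via=quote encodes spaces as %20 instead of '+'
    return f"{SCHOLAR_BASE}?{urlencode(params, quote_via=quote)}"


print(scholar_query_url("VASP", "Kresse", 2020))
```

Setting `as_ylo` and `as_yhi` to the same year restricts the search to a single year, which is what a per-year citation count needs.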

There is an actively maintained watchlist of codes that do not yet meet the relevance criterion.
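The relevance criterion (at least one year with 100 citations or more) can be sketched as a simple check. This is only an illustration under stated assumptions: the function name, the data layout, and the yearly counts for `code_a` and `code_b` are hypothetical and do not reflect how the list or the watchlist is actually stored.

```python
from typing import Mapping


def meets_relevance_criterion(
    citations_by_year: Mapping[int, int], threshold: int = 100
) -> bool:
    """Return True if at least one year reaches the citation threshold."""
    return any(count >= threshold for count in citations_by_year.values())


# Hypothetical yearly citation counts, for illustration only.
candidates = {
    "code_a": {2018: 40, 2019: 95, 2020: 120},
    "code_b": {2018: 10, 2019: 30, 2020: 60},
}

accepted = sorted(n for n, c in candidates.items() if meets_relevance_criterion(c))
watchlist = sorted(n for n, c in candidates.items() if not meets_relevance_criterion(c))

print("list:", accepted)
print("watchlist:", watchlist)
```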

@ltalirz pinned this issue Jan 16, 2021
@ltalirz added the "enhancement" (New feature or request) label Jan 17, 2021
@ltalirz added the "help wanted" (Extra attention is needed) label and removed the "enhancement" (New feature or request) label Apr 10, 2021
@sponce24
Contributor

Hello,

Very nice, Leopold. Maybe EPW could be added (the Google Scholar query 'EPW Giustino' returns 149 citations for 2020).

Thanks,
Samuel

@ltalirz
Owner Author

ltalirz commented Apr 15, 2021

Thanks, Samuel! As you are one of the code authors, would you mind opening a PR to add EPW to https://github.com/ltalirz/atomistic-software/blob/master/src/data/codes.json? I think it would fit into the S=Spectroscopy category (that, by the way, is the least populated and least well-defined; probably misses out on a couple of other codes).
Or let me know which of the "tags" should apply in your view.

No need to add the citation numbers; I'll take care of that.

@ltalirz changed the title from "Adding new simulation engines to the list" to "Adding new simulation engines (+ watchlist)" Jun 28, 2021
@ltalirz changed the title from "Adding new simulation engines (+ watchlist)" to "Adding new simulation engines" Aug 26, 2021
@ltalirz
Owner Author

ltalirz commented Sep 23, 2021

Below is a point raised by @jeffhammond via email, followed by my reply:

Your citation counter uses https://scholar.google.com/scholar?q=NWChem%20Valiev&hl=en&as_sdt=0%2C5&as_ylo=2020&as_yhi=2020, which maps to "NWChem Valiev". Is there any particular reason why you only count NWChem citations associated with Marat Valiev? Are there false positives otherwise?

If you use "NWChem" alone, you see https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&as_ylo=2020&as_yhi=2020&q=NWChem&btnG= has 634 citations instead of 353. This is a nontrivial difference. I would be really surprised if any of these is not associated with the NWChem of interest.

If you want to be explicit about authors, "NWChem Windus OR de-Jong OR Kowalski OR Bylaska OR Apra OR Valiev OR Govind OR Harrison OR Straatsma" will get you 426 citations, with relatively high specificity.

The "code name" + "key author" approach is somewhat of a historical relic from the list created by Luca Ghiringhelli, but there is also some thought behind it. While the "code name"-only approach would work for certain engines with highly unique names (NWChem is one of them), it would not work for other codes (think: ORCA, etc.).
Including author names in the queries for some codes but not for others would create a relative imbalance in the data set, e.g. in cases where the name of a code is mentioned in the text without a citation / author, or in cases where Google Scholar hasn't fully indexed the references.

I think a first improvement over the "code name" + "key author" approach is indeed to include at least the first authors of recent review articles via OR, which I've now done for NWChem. However, I'm also open to discussing dropping the author from the query altogether, as well as any other suggestions for improving the general query approach.
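To make the OR-based variant concrete, here is a minimal Python sketch that assembles a query string from a code name and a list of author names, in the form used in the email quoted above. The helper name `build_query` is hypothetical; how Google Scholar interprets the `OR` operators is not specified in this thread, so the sketch only builds the string.

```python
from typing import Sequence


def build_query(code_name: str, authors: Sequence[str]) -> str:
    """Combine a code name with author names joined by OR.

    Hypothetical helper: with no authors, the query is the code name alone;
    with one author, it reduces to the "code name" + "key author" form.
    """
    if not authors:
        return code_name
    return f"{code_name} " + " OR ".join(authors)


nwchem_authors = [
    "Windus", "de-Jong", "Kowalski", "Bylaska", "Apra",
    "Valiev", "Govind", "Harrison", "Straatsma",
]
print(build_query("NWChem", nwchem_authors))
```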

P.S. One could argue that maintaining a list of review papers per code and simply summing up the corresponding citations would be superior to the current approach (even though one would then double-count some citations). The main drawbacks of this approach are that it significantly increases the maintenance burden, and that it would no longer be possible to direct users to the results of the query used to obtain the citation results (since multiple queries would be needed). If Google Scholar ever adds support for combining multiple "cited by" queries into one, I will consider switching.
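The double-counting effect mentioned in the P.S. can be illustrated with a small Python sketch. It assumes, purely hypothetically, that the set of citing papers for each review article were available as identifiers; the review names and paper IDs below are made up, and nothing in this thread says such per-paper data is currently collected.

```python
from typing import Mapping, Set


def summed_citations(citing_by_review: Mapping[str, Set[str]]) -> int:
    """Sum the per-review citation counts (a paper citing two reviews counts twice)."""
    return sum(len(citing) for citing in citing_by_review.values())


def unique_citations(citing_by_review: Mapping[str, Set[str]]) -> int:
    """Count each citing paper once, regardless of how many reviews it cites."""
    return len(set().union(*citing_by_review.values()))


# Hypothetical data: paper "p3" cites both review articles.
citing_by_review = {
    "review_1": {"p1", "p2", "p3"},
    "review_2": {"p3", "p4"},
}

print(summed_citations(citing_by_review), unique_citations(citing_by_review))
```

The gap between the two numbers is the double-counted share; simple summation only ever sees the first number.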

@ltalirz ltalirz mentioned this issue Sep 23, 2021