This repository has been archived by the owner on May 30, 2021. It is now read-only.

deal with BibTeX information #87

Closed
sotetsuk opened this issue May 8, 2016 · 0 comments

sotetsuk commented May 8, 2016

Why

To obtain the author information for an article, we need its BibTeX entry.

How

1. Naive solution

See release v0.0.1-alpha

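// crawlAndParseBibTeX fetches the citation pop-up for the article, follows its
// first link (the BibTeX export), and stores the response text in a.Bibtex.
func (a *Article) crawlAndParseBibTeX() {
    // Build the URL of the citation pop-up for this article.
    popURL, err := CitePopUpQuery(a.InfoId)
    if err != nil {
        log.Fatal(err)
    }
    popDoc, err := goquery.NewDocument(popURL)
    if err != nil {
        log.Fatal(err)
    }
    // The first link under #gs_citi is taken to be the BibTeX export.
    bibURL, ok := popDoc.Find("#gs_citi > a:first-child").Attr("href")
    if !ok {
        log.Fatal("BibTeX link not found in citation pop-up")
    }
    bibDoc, err := goquery.NewDocument(SCHOLAR_URL + bibURL)
    if err != nil {
        log.Fatal(err)
    }
    a.Bibtex = bibDoc.Text()
}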
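
Since the goal is the author information, here is a minimal sketch of how the authors could be pulled out of the text stored in a.Bibtex. `parseAuthors` is a hypothetical helper, not something in this repository, and it assumes a brace-delimited `author={...}` field with names joined by " and ". Nested braces and quote-delimited values are not handled.

```go
package main

import (
	"regexp"
	"strings"
)

// authorField matches a brace-delimited author field such as
// author={Doe, Jane and Roe, Richard} and captures the text inside the braces.
// It does not handle nested braces or quote-delimited values.
var authorField = regexp.MustCompile(`(?i)\bauthor\s*=\s*\{([^{}]*)\}`)

// parseAuthors is a hypothetical helper. It returns the names found in the
// author field of a single BibTeX entry, or nil if no such field is present.
func parseAuthors(bibtex string) []string {
	m := authorField.FindStringSubmatch(bibtex)
	if m == nil {
		return nil
	}
	var authors []string
	// BibTeX separates individual authors with the word "and".
	for _, name := range strings.Split(m[1], " and ") {
		if name = strings.TrimSpace(name); name != "" {
			authors = append(authors, name)
		}
	}
	return authors
}
```

A real BibTeX parser would be more robust; the regular expression is only meant to show where the author list lives in the entry.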

2. scholar.py's solution

  1. Send a request to GET_SETTINGS_URL (scholar.py#L939).
     [screenshot: 2016-05-15 21 02 55]
  2. Send a request to SET_SETTINGS_URL (scholar.py#L969).
  3. The "Import into BibTeX" link then appears in the results (scholar.py#L457, scholar.py#L994).
     [screenshot: 2016-05-15 21 03 10]
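
A rough Go sketch of the same two-request flow follows. Everything specific in it is an assumption to be checked against scholar.py: the `/scholar_settings` and `/scholar_setprefs` paths, the `scisig` and `scisf` parameter names, the value 4 for BibTeX, and the shape of the hidden form field. The function names are hypothetical.

```go
package main

import (
	"errors"
	"io"
	"net/http"
	"net/url"
	"regexp"
	"strconv"
)

// scholarSite is an assumed base URL.
const scholarSite = "https://scholar.google.com"

// citeFormatBibTeX is assumed to be the settings code that selects BibTeX.
const citeFormatBibTeX = 4

// scisigInput matches a hidden input named "scisig" and captures its value.
// Assumption: name precedes value inside the tag; the real markup may differ.
var scisigInput = regexp.MustCompile(`<input[^>]*name="scisig"[^>]*value="([^"]*)"`)

// extractScisig pulls the form signature out of the settings page HTML.
func extractScisig(settingsHTML string) (string, bool) {
	m := scisigInput.FindStringSubmatch(settingsHTML)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// setPrefsURL builds the request that saves the citation format.
// Only a subset of the parameters scholar.py sends is reproduced here.
func setPrefsURL(scisig string, citeFormat int) string {
	q := url.Values{}
	q.Set("scisig", scisig)
	q.Set("scisf", strconv.Itoa(citeFormat))
	q.Set("hl", "en")
	q.Set("save", "")
	return scholarSite + "/scholar_setprefs?" + q.Encode()
}

// enableBibTeXLinks performs both requests. The client needs a cookie jar
// (see net/http/cookiejar) so the saved preference is sent with later searches.
func enableBibTeXLinks(client *http.Client) error {
	resp, err := client.Get(scholarSite + "/scholar_settings?hl=en")
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	scisig, ok := extractScisig(string(body))
	if !ok {
		return errors.New("scisig field not found on settings page")
	}
	saved, err := client.Get(setPrefsURL(scisig, citeFormatBibTeX))
	if err != nil {
		return err
	}
	return saved.Body.Close()
}
```

The cost of this approach is two extra round trips per session before the first search, plus a dependency on the settings page markup.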

3. hildensia/scholar.py's solution

  • scholar.py#L201
  • It seems to get the "Import into BibTeX" link by sending each request with these headers (?):
headers = {
    'User-Agent': self.UA,
    'Cookie': 'GSP=ID=%(ID)s:CF=%(CF)d' % {
         "ID": self.GID,
         "CF": self.cite_format
    }
}
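
The same header can be sketched in Go with net/http. This is not the repository's code: the function names are hypothetical, the 16-hex-character ID is an assumption about what Google accepts, and CF=4 is assumed to select BibTeX.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
)

// citeFormatBibTeX is assumed to be the CF value that selects BibTeX.
const citeFormatBibTeX = 4

// randomGID returns 16 random hex characters to use as the GSP ID.
// Assumption: any 16-hex-digit value is accepted.
func randomGID() (string, error) {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// gspCookie mirrors the Python format string 'GSP=ID=%(ID)s:CF=%(CF)d'.
func gspCookie(gid string, citeFormat int) string {
	return fmt.Sprintf("GSP=ID=%s:CF=%d", gid, citeFormat)
}

// newScholarRequest builds a GET request carrying the User-Agent and
// Cookie headers shown in the Python snippet.
func newScholarRequest(rawURL, userAgent, gid string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", userAgent)
	req.Header.Set("Cookie", gspCookie(gid, citeFormatBibTeX))
	return req, nil
}
```

Compared with the settings-based flow, this needs no extra round trips: the preference travels with every request.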

4. gscholar's solution

Request the BibTeX URL directly:

https://scholar.google.com/scholar.bib?q=info:0qfs6zbVakoJ:scholar.google.com/&output=citation

See: https://github.com/5kg/gscholar/blob/master/lib/gscholar/paper.rb
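
For reference, building that URL from an article's info ID is plain string formatting. The sketch below assumes the pattern of the example URL above holds for every ID; `bibTeXURL` is a hypothetical name. It only constructs the URL and says nothing about whether the request succeeds.

```go
package main

import "fmt"

// scholarSite is an assumed base URL.
const scholarSite = "https://scholar.google.com"

// bibTeXURL builds the direct scholar.bib URL for an article's info ID,
// following the pattern of the example URL in this issue.
func bibTeXURL(infoID string) string {
	return fmt.Sprintf("%s/scholar.bib?q=info:%s:scholar.google.com/&output=citation", scholarSite, infoID)
}
```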

This solution fails:

[screenshot: 2016-05-15 19 34 08]

## Related

- [scholar.py](https://github.com/ckreibich/scholar.py)
- [gscholar](https://libraries.io/rubygems/gscholar)

@sotetsuk sotetsuk self-assigned this May 8, 2016
@sotetsuk sotetsuk mentioned this issue May 14, 2016
6 tasks
sotetsuk added a commit that referenced this issue Jun 4, 2016
sotetsuk added a commit that referenced this issue Jun 4, 2016
use Cookie for effective bibtex extraction #87