Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:



In [1]:
NAME = "Ziang Liu"



---



Project
=======

The final project is to create a small Python package for OpenAlex based on what we have learned so far. You will create the package and host it on GitHUB. You will turn in a pdf of this notebook.

Your tasks are:
1. Create a pip installable Python package in a GitHUB repo that provides an OpenAlex Works class. The class should have methods to get an RIS and a bibtex entry for a DOI. You can reuse code from previous assignments and lectures. Your class should also have a command line utility that prints RIS or bibtex to the terminal. 
2. Your package must have some tests that show at least some part of the package works correctly.
3. You should make sure your repo passes black and pylint. Your code should pass both of these.
4. You should setup a GitHUB action that runs your tests
5. You should add an Actions status badge that shows in the README.
6. Your package should also have a license.

Put the URL to your repo in the next cell:



In [2]:
%%bash
# Clean up any existing files
rm -rf pkg
pip uninstall -y s23openalex

Found existing installation: s23openalex 0.0.1
Uninstalling s23openalex-0.0.1:
  Successfully uninstalled s23openalex-0.0.1


In [3]:
mkdir -p pkg/s23openalex


In [4]:
%%writefile pkg/setup.py
"""This file help to setup."""

from setuptools import setup

setup(
    name="s23openalex",
    version="0.0.1",
    description="bibtex and RIS",
    maintainer="Ziang Liu",
    maintainer_email="ziangliu@andrew.cmu.edu",
    license="MIT",
    packages=["s23openalex"],
    scripts=[],
    long_description="""get an RIS and a bibtex entry for a DOI""",
)

Writing pkg/setup.py


In [5]:
%%writefile pkg/s23openalex/works.py
"""This file could get an RIS and a bibtex entry for a DOI."""

import requests
import bibtexparser


class Works:
    """This class could get an RIS and a bibtex entry for a DOI."""

    def __init__(self, oaid):
        """Get an RIS and a bibtex entry for a DOI."""
        self.oaid = oaid
        self.req = requests.get(f"https://api.openalex.org/works/{oaid}")
        self.data = self.req.json()

    #         # get bibtex
    #         h = "application/x-bibtex"
    #         res = requests.get(self.data["doi"], headers={"Accept": h})
    #         db = bibtexparser.loads(res.text)
    #         self.bibtex = db.entries[0]

    #         # get RIS

    #         fields = []
    #         if self.data["type"] == "journal-article":
    #             fields += ["TY  - JOUR"]
    #         else:
    #             raise Exception("Unsupported type {self.data['type']}")

    #         for author in self.data["authorships"]:
    #             fields += [f'AU  - {author["author"]["display_name"]}']

    #         fields += [f'PY  - {self.data["publication_year"]}']
    #         fields += [f'TI  - {self.data["title"]}']
    #         fields += [f'JO  - {self.data["host_venue"]["display_name"]}']
    #         fields += [f'VL  - {self.data["biblio"]["volume"]}']

    #         if self.data["biblio"]["issue"]:
    #             fields += [f'IS  - {self.data["biblio"]["issue"]}']

    #         fields += [f'SP  - {self.data["biblio"]["first_page"]}']
    #         fields += [f'EP  - {self.data["biblio"]["last_page"]}']
    #         fields += [f'DO  - {self.data["doi"]}']
    #         fields += ["ER  -"]

    #         self.ris = fields
    def get_bibtex(self):
        """Get a bibtex entry for a DOI."""
        h = "application/x-bibtex"
        res = requests.get(self.data["doi"], headers={"Accept": h})
        db = bibtexparser.loads(res.text)
        self.bibtex = db.entries[0]
        return self.bibtex

    def get_RIS(self):
        """Get an RIS for a DOI."""
        fields = []
        if self.data["type"] == "journal-article":
            fields += ["TY  - JOUR"]
        else:
            raise Exception("Unsupported type {self.data['type']}")

        for author in self.data["authorships"]:
            fields += [f'AU  - {author["author"]["display_name"]}']

        fields += [f'PY  - {self.data["publication_year"]}']
        fields += [f'TI  - {self.data["title"]}']
        fields += [f'JO  - {self.data["host_venue"]["display_name"]}']
        fields += [f'VL  - {self.data["biblio"]["volume"]}']

        if self.data["biblio"]["issue"]:
            fields += [f'IS  - {self.data["biblio"]["issue"]}']

        fields += [f'SP  - {self.data["biblio"]["first_page"]}']
        fields += [f'EP  - {self.data["biblio"]["last_page"]}']
        fields += [f'DO  - {self.data["doi"]}']
        fields += ["ER  -"]

        self.ris = fields
        return self.ris
        

Writing pkg/s23openalex/works.py


In [6]:
%%writefile pkg/s23openalex/__init__.py
"""This file start the pkg."""

from .works import Works

Writing pkg/s23openalex/__init__.py


In [7]:
%%writefile pkg/s23openalex/test_works.py
"""This file test the works."""

import pytest
from s23openalex import Works

bib = {'journal': '{ACS} Catalysis',
 'title': 'Examples of Effective Data Sharing in Scientific Publishing',
 'author': 'John R. Kitchin',
 'pages': '3894--3899',
 'number': '6',
 'volume': '5',
 'publisher': 'American Chemical Society ({ACS})',
 'month': 'may',
 'year': '2015',
 'url': 'https://doi.org/10.1021%2Facscatal.5b00538',
 'doi': '10.1021/acscatal.5b00538',
 'ENTRYTYPE': 'article',
 'ID': 'Kitchin_2015'}

@pytest.fixture()
def setup():
    return bib
    
class TestSort:
    def test_sort(self):
        w
        entries = sort_bibtex('test.bib')
        assert [e['year'] for e in entries] == ['2015', '2018']  

Writing pkg/s23openalex/test_works.py


In [8]:
! black pkg

--- pkg/s23openalex/test_works.py	2023-04-28 22:59:15.332571 +0000
+++ pkg/s23openalex/test_works.py	2023-04-28 22:59:15.506977 +0000
@@ -1,28 +1,32 @@
 """This file test the works."""
 
 import pytest
 from s23openalex import Works
 
-bib = {'journal': '{ACS} Catalysis',
- 'title': 'Examples of Effective Data Sharing in Scientific Publishing',
- 'author': 'John R. Kitchin',
- 'pages': '3894--3899',
- 'number': '6',
- 'volume': '5',
- 'publisher': 'American Chemical Society ({ACS})',
- 'month': 'may',
- 'year': '2015',
- 'url': 'https://doi.org/10.1021%2Facscatal.5b00538',
- 'doi': '10.1021/acscatal.5b00538',
- 'ENTRYTYPE': 'article',
- 'ID': 'Kitchin_2015'}
+bib = {
+    "journal": "{ACS} Catalysis",
+    "title": "Examples of Effective Data Sharing in Scientific Publishing",
+    "author": "John R. Kitchin",
+    "pages": "3894--3899",
+    "number": "6",
+    "volume": "5",
+    "publisher": "American Chemical Society ({ACS})",
+    "month": "may",
+    "year": "2015",
+    "url": "

In [9]:
# ! black pkg

In [19]:
! cat pkg/s23openalex/works.py

"""This file could get an RIS and a bibtex entry for a DOI."""

import requests
import bibtexparser


class Works:
    """This class could get an RIS and a bibtex entry for a DOI."""

    def __init__(self, oaid):
        """Get an RIS and a bibtex entry for a DOI."""
        self.oaid = oaid
        self.req = requests.get(f"https://api.openalex.org/works/{oaid}")
        self.data = self.req.json()

    #         # get bibtex
    #         h = "application/x-bibtex"
    #         res = requests.get(self.data["doi"], headers={"Accept": h})
    #         db = bibtexparser.loads(res.text)
    #         self.bibtex = db.entries[0]

    #         # get RIS

    #         fields = []
    #         if self.data["type"] == "journal-article":
    #             fields += ["TY  - JOUR"]
    #         else:
    #             raise Exception("Unsupported type {self.data['type']}")

    #         for author in self.data["authorships"]:
    #             fields += [f'AU  - {author["author"]["displa

In [11]:
# ! black pkg && flake8 --exclude package/build pkg && pylint --ignore build pkg && pytest pkg
! black pkg && flake8 pkg && pylint pkg && pytest pkg


[1mreformatted pkg/s23openalex/test_works.py[0m
[1mreformatted pkg/s23openalex/works.py[0m

[1mAll done! ✨ 🍰 ✨[0m
[34m[1m2 files [0m[1mreformatted[0m, [34m2 files [0mleft unchanged.
pkg/s23openalex/test_works.py:4:1: F401 's23openalex.Works' imported but unused
pkg/s23openalex/test_works.py:24:1: D103 Missing docstring in public function
pkg/s23openalex/test_works.py:28:1: D101 Missing docstring in public class
pkg/s23openalex/test_works.py:29:1: D102 Missing docstring in public method
pkg/s23openalex/test_works.py:30:9: F821 undefined name 'w'
pkg/s23openalex/test_works.py:31:19: F821 undefined name 'sort_bibtex'
pkg/s23openalex/__init__.py:3:1: F401 '.works.Works' imported but unused


%%writefile pkg/s23openalex/__init__.py
from .works import Works



# Clone the repo here

You should clone your repo in this folder. Use the tree command to show your repo structure:

    ! tree your-repo-name



In [12]:
# This should install the s23openalex package
!pip install ./pkg

Defaulting to user installation because normal site-packages is not writeable
Processing ./pkg
  Preparing metadata (setup.py) ... [?25ldone
[?25hBuilding wheels for collected packages: s23openalex
  Building wheel for s23openalex (setup.py) ... [?25ldone
[?25h  Created wheel for s23openalex: filename=s23openalex-0.0.1-py3-none-any.whl size=2802 sha256=611dd1dee4bec7273d3fe02a109d579ddd4fe9dd22aad709fc3704291a2843f6
  Stored in directory: /tmp/pip-ephem-wheel-cache-v2p78z_6/wheels/a0/63/fe/330c167faff380d6feafcb5aed7af2fb1e123aeb5a729d6373
Successfully built s23openalex
Installing collected packages: s23openalex
Successfully installed s23openalex-0.0.1


In [13]:
! tree pkg


[01;34mpkg[00m
├── [01;34mbuild[00m
│   ├── [01;34mbdist.linux-x86_64[00m
│   └── [01;34mlib[00m
│       └── [01;34ms23openalex[00m
│           ├── __init__.py
│           ├── test_works.py
│           └── works.py
├── [01;34ms23openalex[00m
│   ├── __init__.py
│   ├── test_works.py
│   └── works.py
├── [01;34ms23openalex.egg-info[00m
│   ├── dependency_links.txt
│   ├── PKG-INFO
│   ├── SOURCES.txt
│   └── top_level.txt
└── setup.py

6 directories, 11 files


# Show evidence that your repo passes black and pylint



In [14]:
!pylint --version

pylint 2.14.5
astroid 2.11.7
Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46) 
[GCC 9.4.0]
[0m

In [15]:
%%bash
# black pkg
pylint pkg

# ! black pkg &&  pylint pkg


************* Module pkg
pkg/__init__.py:1:0: F0010: error while code parsing: Unable to load file pkg/__init__.py:
[Errno 2] No such file or directory: 'pkg/__init__.py' (parse-error)


CalledProcessError: Command 'b'# black pkg\npylint pkg\n\n# ! black pkg &&  pylint pkg\n'' returned non-zero exit status 1.

# Tests

Create one or more tests in the repo that show your package works correctly. Show an example here that your tests work.



In [16]:
! pytest pkg



platform linux -- Python 3.9.7, pytest-7.2.2, pluggy-1.0.0
rootdir: /home/jupyter-ziangliu@andrew.cm-5e81f/s23-06682/assignments/project/pkg
plugins: typeguard-2.13.3, anyio-3.6.1
collected 1 item                                                               [0m

pkg/s23openalex/test_works.py [31mF[0m[31m                                          [100%][0m

[31m[1m______________________________ TestSort.test_sort ______________________________[0m

self = <s23openalex.test_works.TestSort object at 0x7fb03681c700>

    [94mdef[39;49;00m [92mtest_sort[39;49;00m([96mself[39;49;00m):
>       w
[1m[31mE       NameError: name 'w' is not defined[0m

[1m[31mpkg/s23openalex/test_works.py[0m:30: NameError
[31mFAILED[0m pkg/s23openalex/test_works.py::[1mTestSort::test_sort[0m - NameError: name 'w' is not defined


# Make some examples of your package to show it works here

Install the package, and show an example for each method (RIS, and bibtex). Provide some evidence that the examples work correctly and generate valid RIS and bibtex.



In [17]:
import s23openalex
from s23openalex import Works
w = Works('https://doi.org/10.1021/acscatal.5b00538')
w.get_bibtex()




{'journal': '{ACS} Catalysis',
 'title': 'Examples of Effective Data Sharing in Scientific Publishing',
 'author': 'John R. Kitchin',
 'pages': '3894--3899',
 'number': '6',
 'volume': '5',
 'publisher': 'American Chemical Society ({ACS})',
 'month': 'may',
 'year': '2015',
 'url': 'https://doi.org/10.1021%2Facscatal.5b00538',
 'doi': '10.1021/acscatal.5b00538',
 'ENTRYTYPE': 'article',
 'ID': 'Kitchin_2015'}

In [18]:
w.get_RIS()

# Show that the commandline utility works. 

Run the command you created and show that it outputs either RIS or bibtex for a DOI.



In [None]:
%%bash

<your command> --some-flag DOI  # depending on flag outputs bibtex or RIS



In [None]:
# Run this cell to generate a pdf from this notebook
# Click the generated links to preview and download it.
# Report errors to Professor Kitchin
from s23 import pdf
%pdf

