# Project

The project should provide some useful functionality in science or engineering. It could be a command-line utility or a package for use in notebooks. It should have some substance, but it does not need to be extensive: I don't expect the code to take more than a few hours to write, and it does not need to be sophisticated. Overall, the project should demonstrate that you have learned something in this class.

1. The project must be pip-installable.
2. Your project should utilize git, and there should be a version history.
3. Your project should have some tests.
4. Your project should use at least one code quality tool.
5. Your project should have a readme.md and LICENSE file.
6. The code should be well documented.
7. The code should be original work.
8. You should push it to a GitHub repo.

At the end of the mini you will give a live demonstration of your project and what it does.

 


Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:



In [1]:
NAME = "Dalen Hsiao"
COLLABORATORS = ""



## Project demo
In these cells, show how your project is installed and how it is used. If it is a CLI, use `%%bash` cells to illustrate it; otherwise, import the library and show what it does.

In [2]:
%%bash 
cd Project
pip install . 


bash: line 1: cd: Project: No such file or directory


Processing /Users/tungyuhsiao/Documents/CMU_MS/MS_Spring_2024/06643_swe/Project
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Building wheels for collected packages: PskLOL
  Building wheel for PskLOL (setup.py): started
  Building wheel for PskLOL (setup.py): finished with status 'done'
  Created wheel for PskLOL: filename=PskLOL-0.0.1-py3-none-any.whl size=5733 sha256=33a73b83378214780a14bda1700516bc3d4ba8790b31df8304b65d3849f3cb04
  Stored in directory: /private/var/folders/lw/r764j9fj36s6yrl2s1r719kw0000gn/T/pip-ephem-wheel-cache-c8sc71s7/wheels/df/6f/e5/560259001c5db20c1fc9950c5d16bb2625d5c86c6f368800ba
Successfully built PskLOL
Installing collected packages: PskLOL
  Attempting uninstall: PskLOL
    Found existing installation: PskLOL 0.0.1
    Uninstalling PskLOL-0.0.1:
      Successfully uninstalled PskLOL-0.0.1
Successfully installed PskLOL-0.0.1
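
Once the install finishes, the result can also be confirmed from Python. The following is a minimal sketch using the standard-library `importlib.metadata`; the helper name `installed_version` is hypothetical and is not part of PskLOL.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# In the environment of the install log above this would be expected to
# report 0.0.1; it prints None wherever PskLOL has not been installed.
print(installed_version("PskLOL"))
```

Returning `None` instead of raising keeps the check usable in a notebook cell whether or not the package is present.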


In [3]:
import PskLOL as lol 

lol.category_search(['carnegie mellon university'], ["engineering", "computer science", "neural science"])

Entity name:  Carnegie Mellon University
------------------------------------------------------
Selected keywords:  ['computer science', 'engineering', 'neural science']
('Keywords', 'neural science', 'is not in this entity')
All categories in the selected entity:  ['computer science', 'mathematics', 'physics', 'engineering', 'biology', 'quantum mechanics', 'artificial intelligence', 'programming language', 'chemistry', 'philosophy', 'economics', 'operating system', 'psychology', 'materials science', 'statistics', 'political science', 'medicine']
Search for existing keywords: "computer science, engineering" ...
------------------------------------------------------
Results Found: 

Entity: Carnegie Mellon University,  Category: computer science,  Score: 74.7
Entity: Carnegie Mellon University,  Category: engineering,  Score: 48.8
Entity name:  Carnegie Mellon University Qatar
------------------------------------------------------
Selected keywords:  ['computer science', 'engineering', 

## Show the outline of your project

Use the `tree` command to show the structure of your project here.

In [4]:
!tree PskLOL

PskLOL
├── __init__.py
├── __pycache__
│   ├── __init__.cpython-311.pyc
│   ├── __init__.cpython-39.pyc
│   ├── _updatedb.cpython-311.pyc
│   ├── _updatedb.cpython-39.pyc
│   ├── category_search.cpython-311.pyc
│   ├── category_search.cpython-39.pyc
│   ├── get_paper.cpython-311.pyc
│   └── update.cpython-311.pyc
├── _db
│   ├── data.csv
│   └── institute_db.json
├── _updatedb.py
├── category_search.py
├── get_paper.py
└── update.py

3 directories, 15 files


## GitHub

Put a link to the GitHub repo here.

[PskLOL: Package for Scientific Keyword search on Local database from OpenAlex Library](https://github.com/dalenhsiao/PskLOL)

## Version control

Describe your version control approach, and show that there is git history using the git log command.

In [5]:
%%bash
cd ./PskLOL/
git log

commit a3c0d914c7e8793bdebb9334dbf0f57eecd66440
Author: Tungyu Hsiao <68240748+dalenhsiao@users.noreply.github.com>
Date:   Mon Apr 15 12:03:50 2024 -0400

    LICENSE

commit 41346227080050e094c625d16b556a4f1abe4d35
Author: dalenhsiao <dalenhsiao0523@gmail.com>
Date:   Mon Apr 15 12:01:53 2024 -0400

    final code

commit bb6777ea8e8c708e431ea93e3d4594aeb260b682
Author: dalenhsiao <dalenhsiao0523@gmail.com>
Date:   Mon Apr 15 11:51:58 2024 -0400

    test cases complete for DB_connector

commit ade5a79675675f5da69935c76388d9ffb0a653cc
Author: dalenhsiao <dalenhsiao0523@gmail.com>
Date:   Mon Apr 15 10:49:13 2024 -0400

    test case for category_search

commit 16d8ce5270bdbde3ea7276ef50384b83f04bc822
Author: dalenhsiao <dalenhsiao0523@gmail.com>
Date:   Mon Apr 15 10:48:23 2024 -0400

    function category_search and _updatedb are available

commit 2179391a7774328010cd5557806f173adc390078
Author: dalenhsiao <dalenhsiao0523@gmail.com>
Date:   Sat Apr 13 22:06:14 2024 -0400

    add ne

## Code quality tool

Describe the code quality tool you used in your project, and show evidence here of how it is implemented and how it is used.

In [6]:
! black PskLOL/ --check
! flake8 PskLOL/

would reformat /Users/tungyuhsiao/Documents/CMU_MS/MS_Spring_2024/06643_swe/Project/PskLOL/_updatedb.py
would reformat /Users/tungyuhsiao/Documents/CMU_MS/MS_Spring_2024/06643_swe/Project/PskLOL/category_search.py
would reformat /Users/tungyuhsiao/Documents/CMU_MS/MS_Spring_2024/06643_swe/Project/PskLOL/get_paper.py

Oh no! 💥 💔 💥
3 files would be reformatted, 2 files would be left unchanged.
PskLOL/__init__.py:1:1: F401 '.category_search.category_search' imported but unused
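
Both `black --check` and `flake8` report problems through a nonzero exit status, so the two checks can be combined into a single pass/fail gate. The following is a minimal sketch of that idea; the helper names `run_checks` and `all_passed` are hypothetical, and the demonstration uses stand-in commands so that it runs without either tool installed.

```python
import subprocess
import sys


def run_checks(commands):
    """Run each command and return a dict mapping its label to its exit code.

    A label maps to None when the executable is not installed.
    """
    results = {}
    for label, argv in commands.items():
        try:
            completed = subprocess.run(argv, capture_output=True, text=True)
        except FileNotFoundError:
            results[label] = None
            continue
        results[label] = completed.returncode
    return results


def all_passed(results):
    """Return True only if every check ran and exited with status 0."""
    return all(code == 0 for code in results.values())


# Stand-in commands: one exits with 0, the other with 1.
demo = run_checks({
    "passes": [sys.executable, "-c", "raise SystemExit(0)"],
    "fails": [sys.executable, "-c", "raise SystemExit(1)"],
})
print(demo, all_passed(demo))
```

For this project the mapping would presumably be `{"black": ["black", "--check", "PskLOL/"], "flake8": ["flake8", "PskLOL/"]}`. With the output recorded above, both tools would exit nonzero and `all_passed` would return `False`.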


## Tests

Describe the tests you built into your project and how they help ensure the project works, and that changes don't break functionality. In the cells below show how you run the tests, and that they work.

In [7]:
import PskLOL as lol
import pandas as pd
from PskLOL._updatedb import DB_connector


def test_fetch_data():
    result = lol._updatedb.fetch_data(
        "carnegie mellon university"
    )
    assert isinstance(result, pd.DataFrame)


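def test_fetch_local_db():
    db = DB_connector('PskLOL/_db/data.csv')
    # `is not {}` compares identity with a brand-new dict and is always True.
    # Check that something was actually loaded instead (this assumes the
    # return value supports len(), e.g. a dict or a DataFrame).
    assert len(db.read_from_csv()) > 0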


def test_update_db():
    db = DB_connector('PskLOL/_db/data.csv')
    assert db.update_data(db.data)


def test_get_data():
    db = DB_connector('PskLOL/_db/data.csv')
    assert isinstance(db.get_data(), pd.DataFrame)

![image.png](attachment:image.png)
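
One detail matters for emptiness checks such as the one in `test_fetch_local_db`: comparing a value against a dict literal with `is not` tests object identity, so the comparison succeeds for any value, including an empty one. The following is a minimal sketch of the difference; `loaded` is a hypothetical stand-in, not the real `DB_connector` output.

```python
# Hypothetical stand-in for what a loader returns when it finds nothing.
loaded = {}

# Identity check: `{}` builds a new dict object each time, so this is True
# even though `loaded` is empty. An assertion written this way cannot fail.
identity_check = loaded is not {}

# Emptiness checks: both of these detect the empty result.
equality_check = loaded != {}
length_check = len(loaded) > 0

print(identity_check, equality_check, length_check)  # True False False
```

A length check also works for container types other than `dict`, which makes it a reasonable default when the exact return type may change.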

When you are done, download a PDF and turn it in on Canvas. Make sure to save your notebook, then run this cell and click on the download link.

In [None]:
%run ~/s24-06643/s24.py
%pdf