xkcd-grab v2

xkcd comics fetched using terminal 🥳


This is the project I've used to demo fuzzy searching and web scraping at the HSP•PESUECC Project Expo ❤️


Hey 👋 This is a CLI tool that uses APIs to retrieve user-requested xkcd comics. It's a relatively small project, still a work in progress because of a lack of data. This project is somewhat of a playground for me to explore different searching and querying techniques.

Because of that data limitation, I made it a goal to make it super easy to find a specific comic from a query. The current roadmap is to make a smart CLI tool that finds the most relevant comic for a given search query.

Table of Contents

  1. Installation and Usage
  2. Cool Stuff
  3. Party Feature
  4. Yet To Come
  5. Requirements - covered in Installation

Installation and Usage:

  • Clone this repository

    git clone https://github.com/bwaklog/xkcd-grab
  • Install the requirements. Some prerequisites:

    1. Python 3+
    2. The tesseract OCR engine, which is going to be used in future updates

    ./install.sh
  • Add an xkcd alias to the path for easier commands. For now, add the alias manually; I still have to figure out how to automate this.

    alias xkcd='./xkcd.sh'

    Side note: the script creates a virtual environment, venv, so you might want to activate it

    . venv/bin/activate

Fuzzy Search Demo

(demo: xkcd-grab-demo-fuzzy)

Web Scraping Demo

(demo: xkcd-grab-demo-google-2)


Commands Available

PS: even if you mess up the commands, there is a help file to guide you...which I am yet to complete :P

Here is the boilerplate that all CLI commands follow:

xkcd <type of request> <extra commands>

1. Type of request

| type of request | flags |
| --- | --- |
| latest comic | -l, --latest |
| specific comic number | an integer, e.g. 297 |
| fuzzy search of comic titles | -f, --fuzzy |
| regular search, using something like an SQL syntax | -s, --search |
| web scraping search, using Google's search algorithm to find the best result | -g, --google |
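
For context, the latest-comic and numbered-comic requests in the table above line up with xkcd's public JSON API (https://xkcd.com/info.0.json for the latest comic, https://xkcd.com/<number>/info.0.json for a specific one). The snippet below is only a minimal sketch of that kind of fetch; the function name and the use of requests are my assumptions, not the tool's actual code.

```python
# Minimal sketch (not the project's actual code): fetch comic metadata from
# xkcd's public JSON API, which the -l/--latest and numeric requests map onto.
from typing import Optional

import requests


def fetch_comic(num: Optional[int] = None) -> dict:
    """Return the JSON metadata for a comic; the latest comic if num is None."""
    url = "https://xkcd.com/info.0.json" if num is None else f"https://xkcd.com/{num}/info.0.json"
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.json()  # contains "num", "title", "img", "alt", ...


latest = fetch_comic()       # roughly what `xkcd -l` needs
specific = fetch_comic(297)  # roughly what `xkcd 297` needs
print(latest["num"], latest["title"], latest["img"])
```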

2. Extra Commands

| extra command | flags |
| --- | --- |
| quick look comic (uses the system Quick Look on macOS, or the system's default app on other platforms, to display comics) | -q, --ql |
| 🔘 saving comics feature (currently can be done by saving the image opened by Quick Look) | TBA |
| 🔘 sharing comics feature (currently can be done by saving the image opened by Quick Look) | TBA |
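
To give a rough idea of what the -f/--fuzzy flag does conceptually, here is a small sketch that matches a query against comic titles using the standard library's difflib. The actual tool may well use a different fuzzy-matching library; the titles and function below are purely illustrative.

```python
# Illustrative only: fuzzy-matching a query against comic titles with the
# standard library. The real tool may use a dedicated fuzzy-search package.
import difflib

# A tiny stand-in for the real title data
titles = {"Sandwich": 149, "Exploits of a Mom": 327, "Python": 353}


def fuzzy_lookup(query: str, n: int = 3):
    """Return the comic titles closest to the query, best match first."""
    matches = difflib.get_close_matches(query, titles.keys(), n=n, cutoff=0.3)
    return [(title, titles[title]) for title in matches]


print(fuzzy_lookup("exploits mom"))  # e.g. [('Exploits of a Mom', 327)]
```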

Cool stuff:

  • On macOS, the image is opened with the system Quick Look, by utilising the qlmanage command (a small sketch of the idea follows this list).
  • Web scraping uses Google's best matches to find the comic you are searching for. All you need to type is a search query (anything that describes the comic).

    Demo video: https://i.imgur.com/xCOmCyX.mp4
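
Here is the small sketch mentioned above of how the platform-dependent preview could work; it is an assumption about the approach, not the project's actual implementation.

```python
# Rough sketch of the platform-dependent preview described above; the real
# implementation in the tool may differ.
import subprocess
import sys


def preview(image_path: str) -> None:
    if sys.platform == "darwin":
        # macOS: Quick Look preview via the qlmanage command (-p previews files)
        subprocess.run(["qlmanage", "-p", image_path], check=False)
    elif sys.platform.startswith("linux"):
        # Linux: hand the file to the default application
        subprocess.run(["xdg-open", image_path], check=False)
    else:
        # Windows fallback: open with the associated default app
        import os
        os.startfile(image_path)


preview("comic.png")
```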

Yet to come

As mentioned above, this project is somewhat of a playground for me to explore different searching algorithms and querying techniques. While it might have a niche audience, I want to build this tool into a more robust API client. The current roadmap is to make a smart CLI tool that finds the most relevant comic for a search query.

The web scraping function currently built into the app represents the quality of results I am trying to achieve using data from all 2800+ comics alone, so this is still very much a work in progress.
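
As a rough illustration of what "using data from all 2800+ comics" could involve, the sketch below walks the public xkcd JSON API and caches each comic's title and alt text locally. The file name and structure are made up for this example and are not the project's actual data layout.

```python
# Illustrative sketch only: build a local title/alt-text index by walking the
# public xkcd JSON API. The file name and structure here are made up.
import json

import requests


def build_index(path: str = "comics.json") -> None:
    latest = requests.get("https://xkcd.com/info.0.json", timeout=10).json()["num"]
    index = {}
    for num in range(1, latest + 1):
        if num == 404:  # comic 404 deliberately does not exist
            continue
        data = requests.get(f"https://xkcd.com/{num}/info.0.json", timeout=10).json()
        index[num] = {"title": data["title"], "alt": data["alt"]}
    with open(path, "w") as fh:
        json.dump(index, fh)
```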

  • Create a web interface using flask...
    I may or may not go ahead with this option because the main goal was to create a CLI tool, but if needed I'll take a chance on making one.

    • 💾 Local storage options for comics
    • ❤️ Bookmark/liking features
    • 📩 A sharing option. Send your favorite comics to your friends with a few clicks!
    • Umm...a neat interface, because I don't want to end up using tkinter or some other boring-looking tool.
  • There was supposed to be an install.sh script that adds the xkcd.sh alias for you, but that didn't work out because I haven't figured out how to do it yet.

Tabulated stuff for professionalism 🫡

| Feature | Progress |
| --- | --- |
| 🔥 Smart comic search | 🕺 in progress |
| 💾 Local storage option | 👍 workaround available |
| ❤️ Liking/bookmarking option (saves the comic number rather than the comic on local storage) | 🔘 |
| 📩 Sharing feature (undecided) | WAP |
| 🤔 Flask-generated page | TBA |

Party Feature:

⚠️ This is very much in development, but here is how you can use the little orca-mini LLM to make the CLI explain the comic

  1. Install ollama
  2. Install the orca-mini LLM using ollama (about 2.0 GB)

     ollama pull orca-mini
     # if you're familiar with docker, you know what's going on
     # also macOS and Linux only for now, I guess (26th Nov)
  3. Start the server in another terminal window

     ollama serve
  4. Use the flag -e or --explain after a fuzzy search or web scraping for it to start generating once the results are in (a rough sketch of what this calls follows these steps)

     xkcd -f -e
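
For reference, ollama serve exposes a local HTTP API (port 11434 by default), so the -e/--explain step presumably ends up making a call along these lines. The endpoint and payload follow Ollama's documented /api/generate API, but the prompt, function name, and wiring below are my guesses, not the project's actual code.

```python
# Guess at how an --explain step could call the local Ollama server. The
# endpoint and payload follow Ollama's documented /api/generate API; the
# prompt and function are illustrative only.
import requests


def explain(title: str, alt_text: str) -> str:
    payload = {
        "model": "orca-mini",
        "prompt": f"Explain the xkcd comic titled '{title}'. Alt text: {alt_text}",
        "stream": False,  # ask for a single JSON response instead of a stream
    }
    resp = requests.post("http://localhost:11434/api/generate", json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()["response"]


print(explain("Some Comic Title", "the comic's alt text goes here"))
```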

Requirements

What I'm using for this program:

  • This isn't really a disclaimer, but if you don't have Quick Look (macOS only), that's no problem! For now, though, all you get is:
    • 🔗 A link to the image of the post, which you can open in your default browser
    • A very, very, very descriptive bit of info about the post you requested ツ
  • Yeah, I haven't used this on a Windows PC so far, and some of these...most of these commands are UNIX commands, so join the Force with BASH 🕺
  • I'm running this in a venv for development, so do make sure you install all the requirements from requirements.txt
  • That's it for now...nothing else to force you to install...other than python3 🐍
  • ollama and orca-mini for the party feature