Finding Similar Magic: the Gathering Cards Using TF-IDF

This program finds Magic: the Gathering cards whose text is similar to that of a given card, using scikit-learn's TF-IDF module.

TF-IDF stands for "term frequency times inverse document frequency." It is a weighting scheme that scores each word in a text by how often it appears there, scaled by how rare the word is across all texts, so relatively rare words are weighted more heavily than common ones. Words that appear often in a card's text but rarely on other cards are treated as the most descriptive keywords for that card, while words common to many cards receive low weights and contribute little. In this way, each card's text becomes a vector holding one weight per word, and two vectors can be compared (for example, with cosine similarity) to find similar cards.
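To make the idea concrete, here is a minimal sketch in plain Python. It is not the notebook's code and does not use scikit-learn; it only illustrates the weighting and comparison steps described above. The card names and texts are hypothetical, the function names (`tokenize`, `tfidf_vectors`, `cosine`, `most_similar`) are made up for this example, and cosine similarity is assumed as the comparison measure. The IDF formula is one common smoothed variant, chosen here for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical card names and texts, invented for illustration only.
# They are NOT taken from AtomicCards_Small.json.
CARDS = {
    "Spark": "Deal 3 damage to any target.",
    "Zap": "Deal 2 damage to any target.",
    "Bloom": "Target creature gets +3/+3 until end of turn.",
    "Study": "Draw two cards.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return {name: {term: weight}} with L2-normalized TF-IDF weights."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many texts does each term appear?
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    # Smoothed inverse document frequency: rarer terms get larger weights.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}

    vectors = {}
    for name, tokens in tokenized.items():
        term_freq = Counter(tokens)
        vec = {t: count * idf[t] for t, count in term_freq.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors[name] = {t: w / norm for t, w in vec.items()} if norm else vec
    return vectors


def cosine(a, b):
    """Cosine similarity of two L2-normalized sparse vectors (a dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def most_similar(name, vectors, top_n=3):
    """Rank every other card by similarity to the named card, best first."""
    query = vectors[name]
    scores = [(other, cosine(query, vec))
              for other, vec in vectors.items() if other != name]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


VECTORS = tfidf_vectors(CARDS)
print(most_similar("Spark", VECTORS))
```

In this toy data, "Zap" ranks first for "Spark" because the two texts share almost every word, while "Study" shares none and scores zero. On a real card list, a library implementation that stores the vectors as sparse matrices would be the more practical choice.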

If the CardSimilarity.ipynb Jupyter Notebook file doesn't load in GitHub, try viewing it here: https://nbviewer.jupyter.org/github/ekohrt/MtgSearchSimilarCards/blob/main/CardSimilarity.ipynb

Thank you for checking out my project!
