Outputting shortened words of a solution #45

Closed
rbozan opened this issue Jul 8, 2023 · 7 comments

@rbozan

rbozan commented Jul 8, 2023

Hi,

I currently have a use case where I'd like to generate a crossword puzzle in which shortened forms of solutions, when they were also added to the word list, are included in the key value of the JSON output. Sometimes they aren't.

For example, let's say I have a word list of 5k words and create an 8x8 grid using this tool. Among these 5k words, one could be "icecream" and another "cream". There now seems to be a situation where "icecream" is a solution (and is output) but "cream" isn't, even though it was in my word list and is a valid solution.

Using these arguments and shuffling a large (Turkish) dictionary and passing it to stdin:

word-search -l 2 -s 8 -f JSON -o ....

I'm seeing the solution "nemek" and would expect "ne" to be in there as well, but it isn't.

Thanks for the package though

@joshbduncan
Owner

Yeah, when generating the puzzle, the API ignores any words that are "sub" words (words that are inside of others) like "cream" in "icecream". Since this was built for generating word search puzzles, you can't have a word that can be found twice in the same puzzle. The generator also skips any words that are palindromes and words with punctuation.

If you want to see what is going on in the code you can check out the cleanup_input() function inside of utils.py.

def cleanup_input(words: str, secret: bool = False) -> Wordlist:
    """Cleanup provided input string. Removing spaces
    one-letter words, and words with punctuation."""
    if not isinstance(words, str):
        raise TypeError(
            "Words must be a string separated by spaces, commas, or new lines"
        )
    # remove new lines
    words = words.replace("\n", ",")
    # remove excess spaces and commas
    word_list = ",".join(words.split(" ")).split(",")
    # iterate through all words and pick first set that match criteria
    word_set: Wordlist = set()
    while word_list and len(word_set) <= config.max_puzzle_words:
        word = word_list.pop(0)
        if (
            len(word) > 1
            and not contains_punctuation(word)
            and not is_palindrome(word)
            and not word_contains_word(word_set, word.upper())
        ):
            word_set.add(Word(word, secret=secret))
    # if no words were left raise exception
    if not word_set:
        raise ValueError("Use words longer than one-character and without punctuation.")
    return word_set


def word_contains_word(words: Wordlist, word: str) -> bool:
    """Make sure `test_word` cannot be found in any word
    in `words`, going forward or backward."""
    for test_word in words:
        if (
            word in test_word.text.upper()
            or word[::-1] in test_word.text.upper()
            or test_word.text.upper() in word
            or test_word.text.upper()[::-1] in word
        ):
            return True
    return False

You could certainly alter this functionality and just remove the and not word_contains_word(word_set, word.upper()) line from your local copy of the code. That would allow "icecream" and "cream" to exist in the same puzzle.
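To make the effect of that check concrete, here is a minimal standalone sketch that works on plain strings instead of the package's Word objects. The helper names `contains_subword` and `filter_words` are made up for illustration and are not part of the package. It suggests that which word survives depends on input order, which would explain why shuffling a dictionary changes whether "nemek" or "ne" ends up in the puzzle.

```python
def contains_subword(accepted: set[str], candidate: str) -> bool:
    """Plain-string version of the check above: True if `candidate` and any
    accepted word contain each other, reading forward or backward."""
    candidate = candidate.upper()
    for accepted_word in accepted:
        if (
            candidate in accepted_word
            or candidate[::-1] in accepted_word
            or accepted_word in candidate
            or accepted_word[::-1] in candidate
        ):
            return True
    return False


def filter_words(words: list[str]) -> list[str]:
    """Keep words in input order, skipping any that clash with one already kept."""
    kept: list[str] = []
    for word in words:
        if not contains_subword(set(kept), word):
            kept.append(word.upper())
    return kept


# The first word seen wins; the later, overlapping one is dropped.
print(filter_words(["icecream", "cream", "cat"]))  # ['ICECREAM', 'CAT']
print(filter_words(["cream", "icecream", "cat"]))  # ['CREAM', 'CAT']
print(filter_words(["nemek", "ne"]))               # ['NEMEK']
```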

Let me know if this doesn't make sense. Cheers!

@joshbduncan joshbduncan self-assigned this Jul 10, 2023
@joshbduncan
Owner

Hey, I've changed the API a bit to allow users to specify their own set of Validators (or none at all) when generating a puzzle. I haven't released a new version just yet but I have it all working.

By default, everything will work as before as a base set of validators are applied to every WordSearch object unless otherwise specified.

DEFAULT_VALIDATORS = [
    NoPalindromes(),
    NoPunctuation(),
    NoSingleLetterWords(),
    NoSubwords(),
]

If you are using the Python API you can specify validators during creation (or after the fact). Say you only want the NoPunctuation() and NoSingleLetterWords() validators...

from word_search_generator import WordSearch
from word_search_generator.word.validation import NoPunctuation, NoSingleLetterWords

ws = WordSearch("icecream cream cat bat rat", validators=[NoPunctuation(), NoSingleLetterWords()])

If you are using the CLI you could pass the --no-validators flag to disable the default set of validators listed above...

$ word-search icecream cream cat bat rat -l 2 -s 8 -c --no-validators
---------------
  WORD SEARCH
---------------
U Q H H R N F N
I C E C R E A M
I O T A F A R E
M P V T E P A Z
B H U D Q W T Z
A I C S C N O L
T C R E A M W N
V T Y C G Y Q W

Find these words: BAT, CAT, CREAM, ICECREAM, RAT
* Words can go S, SE, NE, and, E.

Answer Key: BAT S @ (1, 5), CAT S @ (4, 2), CREAM E @ (2, 7), ICECREAM E @ (1, 2), RAT S @ (7, 3)

You can even write your own validators if need be...

from word_search_generator.word.validation import Validator

class CustomValidator(Validator):
    """A validator to ensure the value is awesome."""

    def validate(self, value: str, *args, **kwargs) -> bool:
        return "awesome" in value.lower()
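
For anyone wondering how a list of validators could filter a word list, here is a rough self-contained sketch. It does not import the package: `Validator` below is a local stand-in that only mirrors the `validate` signature shown above, and `MinLength`, `LettersOnly`, and `apply_validators` are hypothetical names used for illustration. The package's own internals may well differ.

```python
from abc import ABC, abstractmethod


class Validator(ABC):
    """Local stand-in mirroring the `validate` signature shown above."""

    @abstractmethod
    def validate(self, value: str, *args, **kwargs) -> bool:
        """Return True if `value` is acceptable."""


class MinLength(Validator):
    """Hypothetical validator: accept words with at least `length` characters."""

    def __init__(self, length: int) -> None:
        self.length = length

    def validate(self, value: str, *args, **kwargs) -> bool:
        return len(value) >= self.length


class LettersOnly(Validator):
    """Hypothetical validator: accept words made of letters only."""

    def validate(self, value: str, *args, **kwargs) -> bool:
        return value.isalpha()


def apply_validators(words: list[str], validators: list[Validator]) -> list[str]:
    """Keep the words that every validator accepts (an empty list accepts all)."""
    return [w for w in words if all(v.validate(w) for v in validators)]


words = ["cat", "a", "it's", "cream"]
print(apply_validators(words, [MinLength(2), LettersOnly()]))  # ['cat', 'cream']
print(apply_validators(words, []))  # ['cat', 'a', "it's", 'cream']
```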

@joshbduncan
Owner

@duck57, you may be interested in this custom validator concept for your crossword puzzles (see above and latest merges to main)...

@rbozan
Author

rbozan commented Jul 11, 2023

Hey Josh, that's awesome. I do want to say something about my use case (based on what you posted):

In that example output you sent you can see that:

 CREAM E @ (2, 7), ICECREAM E @ (1, 2), 

However, for my use case I do not necessarily need "cream" and "icecream" to be at two different positions, so they don't have to appear twice. The puzzle could just show "icecream" but also have "cream" as an allowed answer. So for example, it could be:

$ word-search icecream cream cat bat rat -l 2 -s 8 -c --no-validators
---------------
  WORD SEARCH
---------------
U Q H H R N F N
I C E C R E A M
I O T A F A R E
M P V T E P A Z
B H U D Q W T Z
A I C S C N O L
T K R E A P W N
V T Y C G Y Q W

Find these words: BAT, CAT, CREAM, ICECREAM, RAT
* Words can go S, SE, NE, and, E.

Answer Key: BAT S @ (1, 5), CAT S @ (4, 2), CREAM E @ (4, 2), ICECREAM E @ (1, 2), RAT S @ (7, 3)

I hope that makes sense.
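
To sketch the idea: the extra answers could perhaps be derived after generation by post-processing the answer key, without touching the generator. The snippet below is only an illustration under some assumptions. The key is assumed to be already parsed into a dict of `word -> (direction, (x, y))`, the coordinates are 1-based (column, row) as in the output above, and `DIRECTIONS` and `derived_answers` are hypothetical names, not part of the package. It only looks for forward matches inside placed words.

```python
# (dx, dy) step per direction label; y grows downward, matching the 1-based
# (column, row) coordinates printed in the answer key above.
DIRECTIONS = {"E": (1, 0), "S": (0, 1), "SE": (1, 1), "NE": (1, -1)}

Placement = tuple[str, tuple[int, int]]


def derived_answers(key: dict[str, Placement], wordlist: list[str]) -> dict[str, Placement]:
    """Find word-list entries hidden inside placed words and compute where they start."""
    extra: dict[str, Placement] = {}
    for candidate in wordlist:
        candidate = candidate.upper()
        if candidate in key:
            continue  # already a regular answer
        for placed, (direction, (x, y)) in key.items():
            offset = placed.find(candidate)
            if offset == -1:
                continue
            dx, dy = DIRECTIONS[direction]
            # Walk `offset` letters along the placed word to reach the sub-word.
            extra[candidate] = (direction, (x + dx * offset, y + dy * offset))
            break
    return extra


key = {"ICECREAM": ("E", (1, 2)), "CAT": ("S", (4, 2))}
print(derived_answers(key, ["cream", "at", "dog"]))
# {'CREAM': ('E', (4, 2)), 'AT': ('S', (4, 3))}
```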

@joshbduncan
Owner

Yeah, that makes sense but wouldn't be easy using this API. The puzzle generator has some pretty specific rules for placing words that follow the basics of a word-search puzzle. I could see abstracting the generator out as I did with the validators, but then that would require users (or you) to write their own generators from an abstract base class. This may be something I can play with next week if I can find time. I'll let you know. Cheers!

@elcarim

elcarim commented Sep 11, 2023

Where do you get the words? Are there any themed sets? My son loves animals!
Thank you for your wonderful program, it’s clear that it was made with soul!

@joshbduncan
Owner

@elcarim, the words in the package are just a selection from the English dictionary. I don't have any themed sets but that's a great idea I may work on adding... In the meantime, you can pretty easily use your own words from a file and just pipe them into the script via the CLI. Cheers!

$ cat your_custom_word_list.txt | word-search
