Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add permuteAll #605

Merged
merged 9 commits into from Feb 19, 2018
Merged

Add permuteAll #605

merged 9 commits into from Feb 19, 2018

Conversation

kingdavidmartins
Copy link
Contributor

@kingdavidmartins kingdavidmartins commented Feb 12, 2018

Description

add utility function permuteAll [updated with commit 96c37ee]

permuteAll

Uses recursion and Array.push() to return all the permutations of the given input in an array.

const permuteAll = (input) => {
  const result = [];
  let inputState = input;

  if (typeof input === 'string') inputState = input.split('');
  if (typeof input === 'number') inputState = (input).toString().split('');

  const permute = (arr, m = []) => {
    (arr.length === 0)
      ? result.push(m)
      : arr.forEach((_, i) => {
          let curr = arr.slice();
          let next = curr.splice(i, 1);
          permute(curr.slice(), m.concat(next));
        });
  };

  permute(inputState);

  return (typeof input === 'string')
            ? result.map(variant => variant.join(''))
            : (typeof input === 'number') 
                ? result.map(variant => parseFloat(variant.join('')))
                : result;
}
permuteAll('sun') // [ 'sun', 'snu', 'usn', 'uns', 'nsu', 'nus' ]
permuteAll([1, 33, 5]) // [ [ 1, 33, 5 ], [ 1, 5, 33 ], [ 33, 1, 5 ], [ 33, 5, 1 ], [ 5, 1, 33 ], [ 5, 33, 1 ] ]
permuteAll(345) // [ 345, 354, 435, 453, 534, 543 ]

What does your PR belong to?

  • Website
  • Snippets
  • General / Things regarding the repository (like CI Integration)
  • Tests

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (non-breaking improvement of a snippet)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • [x My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have checked that the changes are working properly
  • I have checked that there isn't any PR doing the same
  • I have read the CONTRIBUTING document.

@Chalarangelo
Copy link
Owner

Isn't this essentially anagrams but for arrays? I'd rather this only worked for arrays, so that anagrams can keep its use-cases and that this snippet will be shortened a little bit further. Also, I'd like the PR to pass the Codacy Quality Review before we merge it (just a few missing semicolons etc, it's an easy fix anyways).

@Chalarangelo
Copy link
Owner

Chalarangelo commented Feb 12, 2018

Just to clarify:

const permutations = arr => {
  if (arr.length <= 2)
    return arr.length === 2 ? [arr, [arr[1], arr[0]]] : arr;
  return arr.reduce(
    (acc, item, i) =>
      acc.concat(
        permutations([...arr.slice(0, i), ...arr.slice(i + 1)]).map(val => [
          item,
          ...val,
        ])
      ),
    []
  );
};

The above snippet is slightly shorter and works only for arrays, similar to the way anagrams does for strings. By doing this we can safely work with arrays and have a short and easy to read snippet that doesn't handle mixed input.

This can work with all input types provided originally, like so:

permutations([...'sun']).map(s => s.join(''));  // [ 'sun', 'snu', 'usn', 'uns', 'nsu', 'nus' ]
// The above is the same as `anagrams('sun')` so it's not gonna be used all that much anyways
permutations([1, 33, 5]); // [ [ 1, 33, 5 ], [ 1, 5, 33 ], [ 33, 1, 5 ], [ 33, 5, 1 ], [ 5, 1, 33 ], [ 5, 33, 1 ] ]
permutations([...(345).toString()]).map(v => parseInt(v.join(''))); // [ 345, 354, 435, 453, 534, 543 ]

@kingdavidmartins
Copy link
Contributor Author

kingdavidmartins commented Feb 12, 2018

@Chalarangelo Oh No 😞 I don't know why I didn't catch this before. Our snippet anagrams is completely wrong. By definition an anagram is the following

a word, phrase, or name formed by rearranging the letters of another, such as cinema, formed from iceman.

The key point here is that an anagram is && always will be a valid word. It can't never be an invalid word.

Our anagrams snippet should be renamed to isAnagram and time complexity on worst case should never be more than o(n)

A good naive/simple example which can be seen in standfords computer course anagrams example or when you look up examples checking if a word is an anagram is done by typically doing the following

const isAnagram = (origin, copy) => {
  if (origin.length !== copy.length) {
    return false;
  }
  
  const history = {};
  for (let a = 0; a < origin.length; a++) {
    if (history[origin[a]] === undefined) {
      history[origin[a]] = 1;
    } else if (history[origin[a]] !== undefined) {
      history[origin[a]]++;
    }
  }
  
  for (let a = 0; a < copy.length; a++) {
    if (history[copy[a]] !== undefined) {
      history[copy[a]]--;
    }
  }
  
  let finalState = Object.values(history)
  
  for (let a = 0; a < finalState.length; a++) {
    if (finalState[a] !== 0) {
      return false;
    }
  }
  
  return true;
}

isAnagram('iceman', 'cinema'); // true

of course using es6 this can be significantly reduced with a couple reduces/maps

More resources to checkout

@kingdavidmartins
Copy link
Contributor Author

kingdavidmartins commented Feb 13, 2018

So to answer your question for the reason stated above I don't think we should modify to your proposed snippet.

Also anagrams is string specific and permutations is not

That being said we def need to fix and update anagrams

@mariastervic
Copy link
Contributor

Typically, in programming, anagrams refers to any string with the letters of the original one scrambled. I think that anagrams should stay as it is and permutations be made array-specific. It is impossible to create only the valid lexical anagrams without a dictionary, which would be entirely out of the scope of the project as far as I understand it.

@Chalarangelo
Copy link
Owner

Chalarangelo commented Feb 13, 2018

I, too, think that anagrams should not be altered at all, as it is one of the first snippets that we had on the repo and nobody has complained about it not being a lexical anagram generator (which we do not state anywhere in its description). The functionality is clearly defined and useful as-is, as there are many cases where you want non-lexical anagrams in programming (ids, properties etc.). Apart from that, there is also no easy way to fix said snippet, as this would require us to provide a dictionary and run a search algorithm through it, which is out of our scope.

Apart from that, we should definitely not have the same snippet operate on strings and arrays, as the permutation/anagram operation is extremely costly anyways, so adding extra type checking will only make it worse and the snippet unusable. No matter how similar, for the sake of usability, we should separate array and string permutations/anagrams.


As a side note, Wikipedia's definition of anagrams refers to anagrams of a word, which should be words. However, in programming, you cannot ensure that the given string input to a function like anagrams is a word to begin with, therefore generating all possible strings from the input's permutations is technically valid. Take, for example, the example we provide with anagrams (anagrams('abc') => ['abc','acb','bac','bca','cab','cba'])). In that case, the input string is not a word, therefore Wikipedia's definition does not apply.

What I'm trying to say is that programmers' definitions do not always match real-life counterparts. Also, we are sort of nitpicking at this point, while the original discussion was about letting permutations run on any type of input (which I am against both because we have discussed this in previous snippets and decided it's a red flag and also due to performance issues making the snippet less likely to be useful with type checking and casting - and as a bonus benefit: if permutations is array-specific, it can go under the appropriate array tag, making easier to find, as the utility section is more general-use snippets and it doesn't feel right to me to be under that as its primary tag - however with mixed input it doesn't really fit under array either).

@rohitanwar
Copy link
Contributor

rohitanwar commented Feb 13, 2018

@Chalarangelo @mariastervic I think @kingdavidmartins is right and I have also wanted to say anagrams is wrong but I never did so 😞 . His definition of anagrams is completely correct even in computer science terms but I think his code isAnagram can be shortened :-

const isAnagram = (word1,word2)
       var regex = /[^a-z0-9]/gi;
       var str1 = word1.replace(regex, ''),
       var str2 = word2.replace(regex, '');
       return str1.toLowerCase().split('').sort().join('') === str2.toLowerCase().split('').sort().join(''))

Plus his code can be made faster:-

function permute(permutation) {
  var length = permutation.length,
      result = [permutation.slice()],
      c = new Array(length).fill(0),
      i = 1, k, p;

  while (i < length) {
    if (c[i] < i) {
      k = i % 2 && c[i];
      p = permutation[i];
      permutation[i] = permutation[k];
      permutation[k] = p;
      ++c[i];
      i = 1;
      result.push(permutation.slice());
    } else {
      c[i] = 0;
      ++i;
    }
  }
  return result;
}

Or even more better using generators

function* permute(permutation) {
  var length = permutation.length,
      c = Array(length).fill(0),
      i = 1, k, p;

  yield permutation.slice();
  while (i < length) {
    if (c[i] < i) {
      k = i % 2 && c[i];
      p = permutation[i];
      permutation[i] = permutation[k];
      permutation[k] = p;
      ++c[i];
      i = 1;
      yield permutation.slice();
    } else {
      c[i] = 0;
      ++i;
    }
  }
}

@kingdavidmartins
Copy link
Contributor Author

@Chalarangelo sorry but I have to disagree with you on the following

As a side note, Wikipedia's definition of anagrams refers to anagrams of a word, which should be words. However, in programming, you cannot ensure that the given string input to a function like anagrams is a word to begin with, therefore generating all possible strings from the input's permutations is technically valid. Take, for example, the example we provide with anagrams (anagrams('abc') => ['abc','acb','bac','bca','cab','cba'])). In that case, the input string is not a word, therefore Wikipedia's definition does not apply.

This is a typical Interview question when interviewing with companies in the early stage's when companies are just trying to valid your skills as a programmer. You can google anagram and any coding practice site.

Example:
Leetcode anagram
geekforgeek anagram

I can assure you anagrams as is will never pass and in all instances fail.

You will never come across a question regarding anagrams that'll need you need to print or store all permutation. Unless when given 2 strings and asked to check if one is an anagram of the other you create permutations of one and check if the other exist in the list of permutations and return a boolean. Which still returns a boolean and companies wouldn't go forward since that is seen as the most inefficient solutions since sorting chars is o (n log n) which is better & using a hashmap | histogram is even faster o (n)

The assumption in CS is that if you are given 1 valid word can you confirm if the other word you are given is a valid anagram?

Validating words is extremely hard especially since new words & definitions are generated consistently. So that's not what's being asked. When asked if 1 word is a anagram of the other

It's like someone complaining about specific string snippets not supporting a different type of char encoding format.

When string snippets, methods, function are typically created they are done so with certain assumptions. May that working with special chars, Unicode, ASCII, or etc

@skatcat31 Thoughts??

@kingdavidmartins
Copy link
Contributor Author

@kriadmin Agreed! Although not the fastest since it's o (n log n) it gets the job done and isn't too bad

Only thing I would suggest is that you add a simple check to isAnagrams if the lengths don't match to return false so you don't run into o (n log n) time too much only when necessary

const isAnagram = (word1,word2) => {
  if (word1.length !== word2.length) return false;
  const regex = /[^a-z0-9]/gi;
  const str1 = word1.replace(regex, '');
  const str2 = word2.replace(regex, '');
  return str1.toLowerCase().split('').sort().join('') === str2.toLowerCase().split('').sort().join('')
}

isAnagram('iceman', 'cinema'); // true

@kingdavidmartins
Copy link
Contributor Author

Hey 👋 😄 @mariastervic Thank you for joining in on the conversation however I disagree. The assumption is if given valid words. The purpose isn't to validate if the given string is a "word" but rather it's to know if the given 2 words are anagrams. Assuming they are both valid words.

@Chalarangelo
Copy link
Owner

Chalarangelo commented Feb 13, 2018

Ok, to resolve the situation, here's my proposal:

String snippet stringPermutations - categorized under string

The exact one we now have as anagrams, except we change the wording a little bit to match the definition of permutations, instead of anagrams. Functionality stays the same. We could add the example for numbers here, so that it can be easy for people to find how to create the permutations of digits in a number and then use Array.map(x => parseInt(x)) to convert back to numeric outputs:

stringPermutations(''+345).map(x => parseInt(x)); // [ 345, 354, 435, 453, 534, 543 ]

Array snippet permutations (or arrayPermutations) - categorized under array

Something like the one I proposed, so that it works only on arrays. Mixed input should be avoided to make the snippet shorter and easier to understand. In theory, this should work well with objects, too, using Object.entries(), like so:

const obj = { a: 10, b: 20 };
permutations(Object.entries(obj)).map(o =>
  o.reduce((a, v) => ((a[v[0]] = v[1]), a), {})
); // [ { a: 10, b: 20 }, { b: 20, a: 10 } ]

This might not be the best example in the world, but I can see valid use-cases for it.

Anagram checking isAnagram

This is a valid snippet that should be added, as we have nothing of the sort. My opinion is we should code it for maximum brevity and readability in one of the following ways:

// As a single statement (kinda spreads out too much with prettier, but still easy to understand - lots of code duplication
const isAnagram = (word1, word2) =>
  word1.length !== word2.length
    ? false
    : word1
        .replace(/[^a-z0-9]/gi, '')
        .toLowerCase()
        .split('')
        .sort()
        .join('') ===
        word2
          .replace(/[^a-z0-9]/gi, '')
          .toLowerCase()
          .split('')
          .sort()
          .join('');

isAnagram('iceman', 'cinema'); // true

Or alternatively

// Short, convoluted, a bit too code-golfy, kinda slow due to using `.every()`
const isAnagram = (word1, word2) =>
  word1.length !== word2.length
    ? false
    : [word1, word2]
        .map(str =>
          str.replace(/[^a-z0-9]/gi, '').toLowerCase().split('').sort().join('')
        )
        .every((val, i, arr) => val === arr[0]);

Or even:

// Not a single statement, easy to read and understand, no code duplication
const isAnagram = (word1, word2) => {
  if (word1.length !== word2.length) return false;
  const [str1, str2] = [word1, word2].map(str =>
    str.replace(/[^a-z0-9]/gi, '').toLowerCase().split('').sort().join('')
  );
  return str1 === str2;
};

The last one is, in my opinion the easiest one to read, but all three of them seem ok.

The key reason I insist on breaking them up is to make it easier for people to find said snippets. Mixing both string and array in one snippet will mean we have to categorize under utility which significantly reduces visibility and is sub-optimal in my opinion, plus it makes testing harder. We want snippets to do one thing, so we might as well keep them simple, yet similar.

@Chalarangelo
Copy link
Owner

@kriadmin Your suggestion on using a generator is super interesting, however I am not at all used to using them and so might be a lot of other people. I am totally not against using that, but this will still have the discoverability issue, being under utility, which I think is a bit of a problem. Also, I have no clue how this fares performance-wise. I think @skatcat31 might be able to help us out here in general and specifically pertaining to this suggestion.

@rohitanwar
Copy link
Contributor

@Chalarangelo I think the benefit of using of using generators is that they allow to loop through the results without loading all the permutations at once which is what in most cases the user wants to do so.

@Chalarangelo
Copy link
Owner

@kriadmin Looking at your code again and reading up on generators, I don't think this is an option, due to the fact that all snippets have to follow the arrow function syntax which, as far as I can tell, doesn't seem to work with generators. Also, from the use-cases I have bumped into, it is far more common to want the whole set of permutations, instead of retrieving them one by one, so I am not sure how this would fare in a real-life scenario.

@rohitanwar
Copy link
Contributor

@Chalarangelo Why would someone want all the permutations? If you want to do something with all of them you can use for loop. If you still want all you can do [...function(args)]. But yeah you are right about arrow function not supporting generators. But can't we just be a little lenient for this one?

@Chalarangelo
Copy link
Owner

Chalarangelo commented Feb 14, 2018

@kriadmin We actually can't, as this will need pretty much every single build script to be reworked just for one snippet, which we are definitely not doing. Also, you might want all permutations to filter out something (e.g. use in conjunction with a dictionary to get valid words from permutations of a string or keep permuted arrays that satisfy a condition etc) or apply any other Array.prototype method to get some kind of result or even combine with some of our other array methods to apply a transformation to the generated permutation data.

@atomiks
Copy link
Contributor

atomiks commented Feb 14, 2018

Can't you just do smth like:

const isAnagram = (word1, word2) => {
  const sort = str => str.toLowerCase().split('').sort().join('')
  return sort(word1) === sort(word2)
}

Is the regex replace really necessary btw??

@Chalarangelo
Copy link
Owner

@atomiks That's pretty elegant, honestly. The regex replace is needed to make sure that spaces and special characters are ignored, which is kind of nice to have as far as I know (not a huge problem anyways). Also by checking strict equality, I think the length check is not necessary, although by adding it we could make sure the function returns very fast for long strings whose lengths don't match (not a huge issue to be honest, we can avoid that).

@atomiks
Copy link
Contributor

atomiks commented Feb 14, 2018

Regarding the "anagram must be a valid word" thing, you could supply an array of valid words (as mentioned before, outside the scope of the project), but it gives the option like we did with pluralize.

const isAnagram = (word1, word2, validWords) => {
  const sort = str => str.replace(/[^a-z0-9]/gi, '').toLowerCase().split('').sort().join('')
  return (
    sort(word1) === sort(word2) && 
    validWords 
      ? [word1, word2].every(word => validWords.includes(word))
      : true
  )
}

const VALID_WORDS = ['pear', 'reap']
isAnagram('pear', 'prea') // true
isAnagram('pear', 'prea', VALID_WORDS) // false
isAnagram('pear', 'reap', VALID_WORDS) // true

Adding an array as the last arg to check if they're both valid words becomes expensive tho

@Chalarangelo
Copy link
Owner

@atomiks I don't really think we need to add a dictionary, as this function essentially checks if the second string is an anagram of the first. Now that does not mean it has to be a word, as it could be a sentence that is an anagram of another sentence. Therefore, better to leave the dictionary out. If someone wants to do anagram checking and check validity of words, they are better off using some natural language processing library over ours, I think.

@skatcat31
Copy link
Contributor

I'm confused on what way the conversation is going. It started as "whats the purpose" and is now "should we only check a single anagram for performance reasons and to avoid golf?" or am I missing something?

@Chalarangelo
Copy link
Owner

@skatcat31 The conversation started from the main question if we should allow one permutation snippet to work on both strings and arrays. Then, we realized our definition of anagrams is quite wrong and we tried to figure out the best course of action, which spawned the isAnagram snippet and ultimately led to debating the best way to write its code.

As far as I can tell, nobody is against my suggestion above, so, if no objections are made until next morning (about 12 hours), I will proceed to implement it, updating and merging this PR eventually with all the necessary changes to have 2 permutation snippets for each data type and an anagram-checking snippet.

@skatcat31
Copy link
Contributor

Ah then yes I did miss something while reading. Yeah keeping the spinets separate is the way to go due to the nature of data differences:

  • Strings are made up of Chars, of which there is a limited set
  • Arrays are made of up data, of which there is an infinite set

When operating on sets, you always have to decide how far away you want to go, and another big question becomes "What use is an anagram of an array?" since an array can be transformed and made out of many different ways and end up being the same, and the process is more important with an array than the end value

@Chalarangelo Chalarangelo merged commit 542de4b into master Feb 19, 2018
@Chalarangelo Chalarangelo deleted the add-permuteAll branch February 19, 2018 13:55
@kingdavidmartins
Copy link
Contributor Author

Makes Sense! I can see why we are going with the following

@lock
Copy link

lock bot commented Dec 19, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for any follow-up tasks.

@lock lock bot locked as resolved and limited conversation to collaborators Dec 19, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants