-
Notifications
You must be signed in to change notification settings - Fork 557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Usernames are not random enough -- getting collisions in my specs #19
Comments
Hello, The reason you get duplicates is because ffaker works out of random indexes to arrays of data. Since we want the names to be random and not always generate the same data each time you start getting names, we use Kernel's rand method to get the index to the arrays, and so sometimes you might get duplicated values. One easy 'fix' for this is to create a helper method to store names already generated. I've using sham.rb from an old version of the machinist gem to this purpose: https://gist.github.com/1008143 require 'sham'
require 'ffaker'
Sham.define do
email { Faker::Internet.email }
end
1.upto(10) do
puts Sham.email
end I'm thinking it might be useful to have Sham inside ffaker, wdyt? |
+1 for a built-in Sham inside ffaker |
+1 for having it built-in, I agree. I notice that Machinist 1 requires that the developer uses Sham, while Machinist 2 removes that requirement. I wonder how they get around it. I'm using DataMapper so I'm using Machinist 1, without Sham... and now I understand why Sham is needed, thanks ;) You could probably just have a |
By the way, I think if this was to be built-into FFaker, it has to be optional for the obvious performance hit just to answer a case that's probably only meaningful 5% of the time. Maybe this can be done with something like:
|
I don't think (f)faker are responsible for creating unique values. If you need to create unique username in tests, you can do it easily with adding sequence (e.g. FactoryGirl sequence) Generating not enough random values is quite different issue - that can be a problem, but I don't think your description is that case. |
@chulkilee i don't think using sequence with names makes any sense, meaningful names is the reason anyone started using ffaker in the first place. Why not just maintain a list of used names to keep up with the uniqueness constraints? |
I do think username with sequence is meaningful enough for testing. It's not "meaningless" username as much as "user1", "user2". In the real world people are experiencing collision of usernames, aren't they? And websites suggest adding random unique number (not sequential) at the end in that case (e.g. google) I don't think adding sham in ffaker is a good idea. Generally it's good to have the single responsibility for a library. For example I'm using factory_girl (which can be used in the place of sham). |
Sadly I don't use ffaker myself very much these days and putting some time aside to work on it has not been the first thing in my mind for a while. But I always have in the back burner the idea of removing randomness from ffaker and given each unique ffaker permutation a unique sequence number. I think this would be a great feature. The API would look something like: def (Faker::Name).name(permutation_id = random(max_name_permutations))
blah...
end Rewriting each generator to accept a permutation id would be a lot of effort... Once done, if you needed unique names you would be able to pass an incrementing number to each method: Faker::NameXX.fullname(1) # Tony Kamo
Faker::NameXX.fullname(2) # Anita Lahuerfanita
Faker::NameXX.fullname(3) # Carozo Ynarizota
Faker::NameXX.fullname(1) # Tony Kamo
# etc... A checksum to int mod max_permutation_number could be used to map names to names in funny ways Faker::NameUS.fullname("Emmanuel Oga".hash % NameUS::FULLNAME_COUNT) # John Doe 😄 One MAJOR advantage is that the test suite would be a lot cleaner and less flaky! |
@EmmanuelOga would be really useful! Just stumbled upon this "buggy" behaviour as well... |
This looks like an outdated issue, FFaker now supports unique results e.g. |
This happens very sporadically, but in tests I'm using ffaker to generate random usernames. I was previously just using
ActiveSupport::SecureRandom.hex(16)
, but a colleague convinced me that ffaker was the way to go, so I switched and have had much success, besides this single issue.In specs where I needs to create two or more users, sometimes, but not always (read, mostly not), ffaker generates the same username twice, which causes my specs to fail due to duplicate key conflicts in the database. Re-running the test suite always corrects the issue. My database is correctly cleaned between examples... this issue is localized to within a single example/test.
I think a possible solution (and maybe it actually exists but I don't see it), is for ffaker to strike out already used usernames, so that the same one cannot be generated twice, then provide some method to reset ffaker's dictionary so that all struck-out entries are released for re-use. This way one can just reset ffaker in their setup/teardown logic.
The text was updated successfully, but these errors were encountered: