Potential generators #11

Closed
noamross opened this issue Jan 23, 2017 · 8 comments

@noamross

Some potential generator ideas for scientific users. Many would be convenience wrappers.

Geographic:

  • lat/long in bounds
  • place names generated by selecting the location names of a given type closest to lat/lon coordinates
  • Countries, U.S. states, etc.
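
As an illustration of the lat/long-in-bounds idea, here is a minimal Python sketch that draws a point uniformly from a bounding box. The function name `random_point` and its parameters are hypothetical and not part of any package discussed in this thread. The draw is uniform in degrees, not uniform by area on the sphere.

```python
import random

def random_point(lon_min=-180.0, lon_max=180.0, lat_min=-90.0, lat_max=90.0, rng=None):
    """Return a (lon, lat) pair drawn uniformly in degrees from a bounding box."""
    rng = rng or random.Random()
    if lon_min > lon_max or lat_min > lat_max:
        raise ValueError("a minimum bound exceeds its maximum bound")
    return rng.uniform(lon_min, lon_max), rng.uniform(lat_min, lat_max)

# Example: a point somewhere inside a rough box around the contiguous U.S.
lon, lat = random_point(-125.0, -66.0, 24.0, 50.0, rng=random.Random(1))
```

Passing a seeded `random.Random` keeps generated test data reproducible.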

Numeric

  • Value in range for integers and doubles
  • Values from a distribution
  • Multiple values from a multivariate distribution (e.g., multivariate normal)
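
The numeric ideas could be sketched roughly as below, using only the Python standard library. All function names are hypothetical. The multivariate case is limited to a bivariate normal built from two independent standard normal draws, which is one simple choice among many.

```python
import math
import random

def int_in_range(lo, hi, rng):
    """Integer in [lo, hi], both ends inclusive."""
    return rng.randint(lo, hi)

def double_in_range(lo, hi, rng):
    """Floating-point value drawn uniformly between lo and hi."""
    return rng.uniform(lo, hi)

def bivariate_normal(mean, sd, rho, rng):
    """One (x, y) draw from a bivariate normal with correlation rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must be between -1 and 1")
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x = mean[0] + sd[0] * z1
    # Mix z1 into y so that corr(x, y) equals rho.
    y = mean[1] + sd[1] * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return x, y

rng = random.Random(0)
count = int_in_range(1, 10, rng)
height, weight = bivariate_normal((170.0, 70.0), (10.0, 15.0), 0.6, rng)
```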

Categorical

  • Samples from a set of categorical values
  • UUIDs
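
To make the categorical item concrete: one reading is drawing values, with replacement, from a fixed set of levels, optionally with weights. The sketch below shows that reading alongside UUID generation. The function name `sample_categories` and the example levels are hypothetical; `random.choices` and `uuid.uuid4` are Python standard library calls.

```python
import random
import uuid

def sample_categories(levels, n, weights=None, rng=None):
    """Draw n values with replacement from levels, optionally weighted."""
    rng = rng or random.Random()
    return rng.choices(levels, weights=weights, k=n)

habitats = sample_categories(
    ["forest", "grassland", "wetland"], 5,
    weights=[0.5, 0.3, 0.2], rng=random.Random(42),
)

# A random (version 4) UUID as a string.
record_id = str(uuid.uuid4())
```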

Biological

  • Gene sequences
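
For gene sequences, a very rough sketch is a random DNA string with a target GC fraction. This assumes independent bases and ignores real sequence structure (codons, motifs, and so on). The name `random_dna` and the `gc_content` parameter are hypothetical.

```python
import random

def random_dna(length, gc_content=0.5, rng=None):
    """Return a random DNA string with the given expected GC fraction."""
    if not 0.0 <= gc_content <= 1.0:
        raise ValueError("gc_content must be between 0 and 1")
    rng = rng or random.Random()
    at = (1.0 - gc_content) / 2.0  # weight for each of A and T
    gc = gc_content / 2.0          # weight for each of G and C
    return "".join(rng.choices("ACGT", weights=[at, gc, gc, at], k=length))

seq = random_dna(30, gc_content=0.4, rng=random.Random(7))
```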

Literature-based

  • DOIs (both real and fake/non-resolving)
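
A fake DOI can at least be made syntactically plausible: a `10.` prefix with a numeric registrant code, a slash, then a suffix. The sketch below does only that. It does not check whether the result happens to resolve, so "non-resolving" is not guaranteed. The name `fake_doi` and the loose `DOI_PATTERN` check are assumptions for illustration.

```python
import random
import re
import string

# Loose shape check for illustration, not a full DOI validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def fake_doi(rng=None):
    """Return a string shaped like a DOI, e.g. 10.<registrant>/<suffix>."""
    rng = rng or random.Random()
    registrant = rng.randint(1000, 99999)
    suffix = "".join(rng.choices(string.ascii_lowercase + string.digits, k=8))
    return f"10.{registrant}/{suffix}"

doi = fake_doi(random.Random(3))
```
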
@sckott
Collaborator

sckott commented Jan 24, 2017

thanks @noamross

i think most can be done easily

@sckott sckott added the feature label Jan 24, 2017
@sckott
Collaborator

sckott commented Jan 27, 2017

changes in 0057a47 adding some of the above

@sckott sckott modified the milestone: v0.1 Jan 28, 2017
@sckott
Collaborator

sckott commented Jan 29, 2017

Geographic:

  • lat/long in bounds
  • place names generated by selecting the location names of a given type closest to lat/lon coordinates - not clear what this means
  • Countries, U.S. states, etc.

Numeric

  • Value in range for integers and doubles
  • Values from a distribution
  • Multiple values from a multivariate distribution (e.g., multivariate normal) - which distribution to use?

Categorical

  • Samples from a set of categorical values - not sure what this means
  • UUIDs - there's a CRAN package to do this, i guess could just import that, but maybe just point people to that if they want uuids

Biological

  • Gene sequences

Literature-based

  • DOIs (both real and fake/non-resolving)

@sckott sckott modified the milestone: v0.1 May 30, 2017
@iembry-USGS

I am suggesting fake data for business transactional data:

  • Order ID
  • Location ID
  • Product ID
  • Purchase Date
  • Price Paid
  • Product Type (Types of Clothing, Types of Electronics, etc.)
  • Company Name - specific for industries (Clothing, Sports, Books, etc.)
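
The fields listed above could be combined into one fake record along these lines. This is only a sketch: the function name `fake_transaction`, the ID ranges, the date window, the price range, and the `PRODUCT_TYPES` lookup are all made up for illustration. Company names are left out because they would need a curated per-industry list.

```python
import datetime
import random

# Hypothetical lookup of product types per industry.
PRODUCT_TYPES = {
    "Clothing": ["shirt", "jacket", "socks"],
    "Electronics": ["phone", "laptop", "headphones"],
}

def fake_transaction(rng=None):
    """Return one fake transaction as a dict."""
    rng = rng or random.Random()
    industry = rng.choice(sorted(PRODUCT_TYPES))
    start = datetime.date(2017, 1, 1)
    return {
        "order_id": rng.randint(100000, 999999),
        "location_id": rng.randint(1, 50),
        "product_id": rng.randint(1000, 9999),
        # A day within the 365 days starting at `start`.
        "purchase_date": start + datetime.timedelta(days=rng.randrange(365)),
        "price_paid": round(rng.uniform(1.0, 500.0), 2),
        "industry": industry,
        "product_type": rng.choice(PRODUCT_TYPES[industry]),
    }

record = fake_transaction(random.Random(11))
```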

Thank you.

Irucka

@sckott
Collaborator

sckott commented Jun 26, 2017

thanks @iembry-USGS - out of curiosity, would you use these for work/personal use - and just in english or other languages as well?

@iembry-USGS

@sckott You're welcome. I would use them for work and just in English.

Thank you.

Irucka

@sckott
Collaborator

sckott commented Jul 22, 2017

thanks @iembry-USGS

opening a new issue to discuss your ideas - continue in #52

@sckott
Collaborator

sckott commented Dec 2, 2017

closing, not sure i'll get to the other ones here

@sckott sckott closed this as completed Dec 2, 2017