Pattern [0-9,\s]{5} generates whitespace symbols that user probably didn't expect - make configurable? #77

Closed
mockmotor opened this issue May 18, 2023 · 5 comments · Fixed by #87

mockmotor commented May 18, 2023

A simple pattern [0-9,\s]{5} generates strings like (in hex):

0035 0031 0020 000d 000c
0009 0035 000c 0035 0035

That is, the output uses CR and LF, as well as characters such as FF (form feed) and TAB.

Formally, it is correct - those control characters match the \s format.
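To make the point concrete, here is a minimal sketch that checks a few code points from the hex dump above against Java's `\s` class. The class and method names are made up for illustration and are not part of the library.

```java
import java.util.regex.Pattern;

public class WhitespaceClassDemo {
    private static final Pattern WHITESPACE = Pattern.compile("\\s");

    // Returns true when the single code point matches the regex class \s.
    static boolean matchesBackslashS(int codePoint) {
        String single = new String(Character.toChars(codePoint));
        return WHITESPACE.matcher(single).matches();
    }

    public static void main(String[] args) {
        // Space, TAB, LF, FF, CR, and the digit '5' for contrast.
        int[] samples = {0x0020, 0x0009, 0x000A, 0x000C, 0x000D, 0x0035};
        for (int cp : samples) {
            System.out.printf("U+%04X matches \\s: %b%n", cp, matchesBackslashS(cp));
        }
    }
}
```

All of the control characters in the sample match, so a generator that picks any character from `\s` is within its rights to emit them.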

However, in practice, users only think of spaces and maybe tabs when using the \s format.

It would be convenient to make this behaviour configurable via options.

I propose three options:

  • \s expands only into space character
  • \s expands into space and tab
  • \s expands into any character for which Character.isWhitespace() returns true
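A rough sketch of how the three proposed modes could map to candidate characters. The enum `WhitespaceMode` and the method `candidates` are hypothetical names chosen for this example; they are not the library's API, and the scan is limited to the ASCII range to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

public class WhitespaceOptionsSketch {
    // Hypothetical option names; not taken from the library.
    enum WhitespaceMode { SPACE_ONLY, SPACE_AND_TAB, ANY_WHITESPACE }

    // Returns the characters a generator might choose from when it sees \s.
    static List<Character> candidates(WhitespaceMode mode) {
        List<Character> result = new ArrayList<>();
        switch (mode) {
            case SPACE_ONLY:
                result.add(' ');
                break;
            case SPACE_AND_TAB:
                result.add(' ');
                result.add('\t');
                break;
            case ANY_WHITESPACE:
                // Assumption: only ASCII is scanned here, to keep the sketch small.
                for (char c = 0; c < 128; c++) {
                    if (Character.isWhitespace(c)) {
                        result.add(c);
                    }
                }
                break;
        }
        return result;
    }
}
```

Note that `Character.isWhitespace()` also accepts the separators U+001C to U+001F, which the regex class `\s` does not match by default, so the third mode is slightly wider than `\s` itself.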

In my code, I manually replace the whitespace characters with spaces. It is a workaround, but a bit ugly :)
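The workaround code itself is not shown in this thread; one way it might look, assuming the generated value is already available as a `String`, is a simple post-processing step. The class and method names below are illustrative only.

```java
public class WhitespaceWorkaround {
    // Replaces every character matched by \s with a plain space.
    // The string length is preserved, so a {5} quantifier still holds.
    static String normalizeWhitespace(String generated) {
        return generated.replaceAll("\\s", " ");
    }

    public static void main(String[] args) {
        // First sample from the hex dump: '5', '1', space, CR, FF.
        String sample = "51 \r\f";
        System.out.println("[" + normalizeWhitespace(sample) + "]");
    }
}
```

Because each whitespace character is swapped one for one, the result still matches the original pattern.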

@mockmotor mockmotor added the enhancement New feature or request label May 18, 2023
curious-odd-man (Owner) commented

Hello! Just out of curiosity: if this pattern gives you unexpected characters, why don't you use a pattern that produces only the expected values?
I mean, instead of \s you could list the space and tab characters explicitly; then only those characters would be generated.


mockmotor commented May 23, 2023

Hey there.

The schemas (WSDLs) I receive are not mine; they come from external teams. My goal is to generate reasonable mock payloads for the requests and responses, since I maintain a mock service, https://mockmotor.com.

So when the external schema contains types like

<simpleType name="SpinID">
    <restriction base="string">
      <pattern value="[0-9,\s]{14}" />
    </restriction>
</simpleType>

the schema authors don't expect to see newlines in the generated values, let alone tabs and other control characters.

In fact, when a value can contain multiple lines or free-form text, they just use the type xs:string.

The very fact of using a pattern signals to me that the value is very likely a single-line value.

curious-odd-man (Owner) commented

Hello! Thank you for the explanation. I'll implement the requested change.

@curious-odd-man curious-odd-man added this to the Version 1.5 milestone Jun 1, 2023
@curious-odd-man curious-odd-man linked a pull request Jun 17, 2023 that will close this issue
curious-odd-man added a commit that referenced this issue Jun 18, 2023
curious-odd-man (Owner) commented

Hello @mockmotor !

This is now implemented as part of the 2.0 release.
Version 2.0 will be released in the future; for now, you can test and use this feature from the 2.0-SNAPSHOT version:

<project>
    <repositories>
        <repository>
            <id>snapshots-repository</id>
            <url>https://oss.sonatype.org/content/repositories/snapshots/</url>
        </repository>
    </repositories>

    <!--  .... -->

    <dependencies>
        <dependency>
            <groupId>com.github.curious-odd-man</groupId>
            <artifactId>rgxgen</artifactId>
            <version>2.0-SNAPSHOT</version>
        </dependency>
    </dependencies>
</project>

mockmotor (Author) commented

Excellent, thanks! I will test it for the next week's build!
