Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

entries have addition unset fields #176

Closed
hendrixjoseph opened this issue Feb 6, 2018 · 56 comments
Closed

entries have addition unset fields #176

hendrixjoseph opened this issue Feb 6, 2018 · 56 comments
Labels

Comments

@hendrixjoseph
Copy link

Staticman entries are having fields that are not set. For instance, with this entry on the popcorn demo site:

eduardoboucas/popcorn#2220

It has fields "layout" and "message" which, as far as I can tell, should not be there. For instance, this older entry does not have these extra fields:

eduardoboucas/popcorn#2196

It also seems these fields are overwriting any fields that have the same name. This entry on my site:

hendrixjoseph/hendrixjoseph.github.io#135

Has a blank entry for "message" which should not be empty.

@rjtngit
Copy link

rjtngit commented Feb 6, 2018

This is happening on my site as well. I renamed the field as a workaround in the meantime.

@IanCaio
Copy link

IanCaio commented Feb 16, 2018

Just made a quick test and it's confirmed here too:

iancaio/staticmanapptest#7

@IanCaio
Copy link

IanCaio commented Feb 17, 2018

I made another test and the layout, timestamp and message fields are gone, but now there's a new unset field being added: approved: false. That's very odd, I wonder if the instance of staticman running on the website is the same as the one here on Github, because I've been hunting the source code for leads and haven't found anything so far.. Maybe @eduardoboucas is testing some new features?

iancaio/staticmanapptest#8

@mmistakes
Copy link

Noticing the same thing across a couple repos.

Not exactly sure how to reproduce it as it doesn't always happen. The only common pattern I've noticed is message: ' ' will be blanked out when these extra fields start showing up (in my case approved, title, and layout.)

Example:

_id: 41f07470-17bf-11e8-b61b-7b344215416b
message: ' '
url: ''
approved: false
title: Comment
layout: post

mmistakes/made-mistakes-jekyll@25eef09

mmistakes/minimal-mistakes@48c1dfb

Xabacadabra/Xabacadabra.github.io@cc03062

@eduardoboucas
Copy link
Owner

This is bizarre. I’m not testing any new feature (and wouldn’t do so on the live API anyway). I’ll try to look into this in the next couple of days.

@eduardoboucas
Copy link
Owner

Trying to find a pattern common to people having this issue. Do you have moderation enabled or disabled? Are you all using Jekyll?

layout: post sounds like something Jekyll would use, but why would that be sent to Staticman? It's really confusing!

@MiKatre
Copy link

MiKatre commented Feb 22, 2018

This happened to me using Hexo with moderation enabled. However, the issue disappeared a week ago when i added new fields.

@mmistakes
Copy link

Jekyll and moderation on both sites. The strange thing is one doesn’t have a post layout yet somehow it’s being assigned.

@hendrixjoseph
Copy link
Author

shouldn't the "allowedFields" parameters cause staticman to deny any extra fields submitted?

@hendrixjoseph
Copy link
Author

I'm trying to capture the actual post request and response in my browser's developer tools, but of course I can't even duplicate the bug right now.

@hendrixjoseph
Copy link
Author

details for me: GitHub pages (Jekyll), StaticMan V2, moderation enabled

@IanCaio
Copy link

IanCaio commented Feb 23, 2018

Details: GitHub Pages (Jekyll), Staticman V2, Moderation enabled.

Right now I can't replicate the bug. The last time I could was 2 days ago, where the approved and title fields were still being added.

@JokerQyou
Copy link

I'm using Hugo with Staticman V2, moderation enabled. Observed additional fields varies from entry to entry. But all the entries were created by spam bots this year. Some of the fields are commentDate, tags and layout.
Although with moderation enabled, these entries will not be published to my site directly, but it is possible for spam bots to blow up Github pull requests.

@eduardoboucas
Copy link
Owner

Sorry for the slow response from my part on this issue. This is a really difficult issue to replicate.

First of all, has anyone seen an instance of this where the submission was not sent by a spam bot? This is important to understand whether Staticman is simply processing the fields it receives, or whether it's somehow injecting random fields on its own (I think the latter is very unlikely).

Also, can each of you confirm whether you're using allowedFields property in your Staticman configs? As @hendrixjoseph rightly pointed out, this should block entries with fields that are not in that whitelist. I did find an issue with this, which made it so that if a field that was not in allowedFields was sent in the payload and its contents were empty, it did go through as an empty string. This has been fixed.

This doesn't explain, however, reports of extra fields being added with content (e.g. layout: "post"). The only explanation that fits my current theory is that spam bots are sending this and site owners do not have an allowedFields property.

@JokerQyou
Copy link

JokerQyou commented Mar 17, 2018

Yes I am, in staticman.yml: allowedFields: ["name", "email", "url", "content", "reply_to"]. With this config I'm still getting extra fields like I previously said.

Actually, the first example given by @hendrixjoseph is what you are looking for: the popcorn site has allowedFields set, and it was receiving pull requests with extra fields. Also, I don't think that one was submitted by a spam bot. eduardoboucas/popcorn#2220

@eduardoboucas
Copy link
Owner

With this config I'm still getting extra fields like I previously said.

Are you able to replicate this yourself?

@JokerQyou
Copy link

No, I cannot. All the pull requests in my repo with this problem were submitted by spam bots.

@eduardoboucas
Copy link
Owner

I really don't understand how they are coming through. 😞

@eduardoboucas
Copy link
Owner

I have added some logging at request level for debugging, so I'll keep an eye on it in the next few days to see what's up.

@JokerQyou
Copy link

JokerQyou commented Mar 19, 2018

So another new comment went through 3 hours ago, I wonder if you caught it in log. The file content is as follows:

_id: e7b19280-2b5c-11e8-86f2-611b1a84ae15
content: "I Ԁdo know!? Sɑid Larry. ?I wager he likes angeⅼs aas a reѕult of hhe has \r\ntheem around all of the time. Possibly he and thе \r\nangels play houseһold gaеs like we do sometіmes. Possibly they play \r\nMonopoly.? This mwde Mommy chortle really hard."
name: example.net
email: 29218ddb8706250663d6999551c2c519
url: 'http://example.net/'
date: 1521453866
layout: post

Notice: I've replaced the website URL and user name with an example domain, other characters remain unchanged.

@eduardoboucas
Copy link
Owner

eduardoboucas commented Mar 19, 2018

@JokerQyou Can you show me the contents of your staticman.yml? This is really strange. I did catch it in the log and there's no mention of date or layout fields anywhere in the request, so Staticman is inserting them on its own somehow!

@mmistakes
Copy link

I made a test comment to the Popcorn demo site earlier today, and noticed it was leaking layout into the datafile.

@eduardoboucas
Copy link
Owner

This is super useful information, thank you both. I'll keep working on tracking it down.

@JokerQyou
Copy link

JokerQyou commented Mar 19, 2018

date is one of generatedFields, so that field is natural. Here is my config file (staticman.yml):

comments:
  allowedFields: ["name", "email", "url", "content", "reply_to"]
  allowedOrigins: ["example.net"]
  branch: "master"

  commitMessage: "Staticman: new comment posted"

  filename: "comment-{@timestamp}"

  format: "yml"
  extension: "yaml"

  generatedFields:
    date:
      type: date
      options:
        format: "timestamp-seconds"

  moderation: true

  name: "XXX"

  path: "data/comments/{options.entryId}"

  requiredFields: ["name", "email", "content"]

  transforms:
    email: md5

@IanCaio
Copy link

IanCaio commented Mar 19, 2018

For a while the problem seemed to vanish, but seeing the updates here I tested once again and the extra fields are being injected. No spam bots, just submitting manually from this page.

staticman.yml:

test:
  allowedFields: ["name", "age"]
  branch: "master"
  filename: "{@id}"
  format: "json"
  moderation: true
  path: "_data/test"
  requiredFields: ["name", "age"]

Last pull request from Staticman:

https://github.com/IanCaio/staticmanapptest/pull/16

@eduardoboucas
Copy link
Owner

I'm really sorry, but I haven't had the time to look into this. I'm starting to think that keeping a public instance of Staticman isn't feasible unless there's more people available to look after it.

@chmac The public instance is running on Heroku. master is mirrored to api.staticman.net and dev is mirrored to dev.staticman.net, so whenever you merge something to one of these branches, the changes are deployed. I can give you access to the repository, so you can play with the dev instance? (Just be careful with master 😅)

I appreciate all the time everyone is putting into tracking this down and, again, apologies for my lack of availability to look into this.

@chmac
Copy link
Collaborator

chmac commented Jun 13, 2018

@eduardoboucas Totally understand, open source is hard, especially infrastructure. Really appreciate all the work you've put in already.

I'm definitely happy to debug. Can I see logs on heroku? That would be only challenge. You can also lock master btw, so only you are allowed to commit to it. Maybe not a bad idea, at least at first.

chmac added a commit to GatsbyCentral/staticman that referenced this issue Jun 13, 2018
chmac added a commit to GatsbyCentral/staticman that referenced this issue Jun 13, 2018
chmac added a commit to GatsbyCentral/staticman that referenced this issue Jun 13, 2018
eduardoboucas added a commit that referenced this issue Jun 13, 2018
Log `fields` multiple times per request. #176
@chmac
Copy link
Collaborator

chmac commented Jun 13, 2018

I've added more debugging calls. But now that the app has been restarted, the problem has disappeared. I think it will only appear after some time. If you see this issue from now on, please let me know ASAP and hopefully the extra logs will shed some light on the issue.

For now, I guess all we can do is wait...

chmac added a commit to GatsbyCentral/staticman that referenced this issue Jun 13, 2018
chmac added a commit to GatsbyCentral/staticman that referenced this issue Jun 13, 2018
chmac added a commit to GatsbyCentral/staticman that referenced this issue Jun 13, 2018
chmac added a commit to GatsbyCentral/staticman that referenced this issue Jun 13, 2018
chmac added a commit to GatsbyCentral/staticman that referenced this issue Jun 13, 2018
chmac added a commit to GatsbyCentral/staticman that referenced this issue Jun 13, 2018
chmac added a commit to GatsbyCentral/staticman that referenced this issue Jun 13, 2018
This should remove all of the `logger.info()` calls we used to track down eduardoboucas#176.
eduardoboucas added a commit that referenced this issue Jun 13, 2018
ntsim pushed a commit to ntsim/staticman that referenced this issue Jul 12, 2018
This should remove all of the `logger.info()` calls we used to track down eduardoboucas#176.
ntsim pushed a commit to ntsim/staticman that referenced this issue Jul 17, 2018
…ig tests

This change primarily addresses the Staticman tests. It appears that most
Staticman tests had been ignored by accident via an errant `jest.only`
and were not actually running.

Some of the site config tests were also failing and logging unnecessarily
(this is due to a log statement from eduardoboucas#176 still remaining in the codebase).

The site config test helper `getConfig` has also been refactored to remove
the special handling of the `recaptcha.secret`. Upon removal there doesn't
seem to have been any notable problems and tests are passing (after some
cleanup in relevant areas).
@fbnlsr
Copy link

fbnlsr commented Sep 18, 2018

Any idea if this issue is really fixed? I'm still getting dozens of spam per week, with empty fields added to the list of allowed ones. I've set up a honeypot on my site but no luck so far, the spam keeps getting through.

@hendrixjoseph
Copy link
Author

@fbnlsr what is the site / repo you're getting this issue on?

@fbnlsr
Copy link

fbnlsr commented Sep 19, 2018

@chmac
Copy link
Collaborator

chmac commented Sep 19, 2018

@fbnlsr I had a quick look at your repo and the closed PRs, but I couldn't find any comments with additional fields added. I saw spam, but only your specified fields. Can you point us at a PR which has the extra fields?

@fbnlsr
Copy link

fbnlsr commented Sep 19, 2018

You're right, I might have confused additional fields with how GitHub presents the PR and formats the table.

I guess I need to work on my spam filter then, I've set up a honeypot but it seems it's not enough :(

@donaldboulton
Copy link

donaldboulton commented Sep 20, 2018

I am real aware of the spam issue, could not stop it so I switched to disqus, even with recaptcha and honey pots and form validation js = spam spam...
Recaptcha should be taken out of staticman api and let the users set it up on there forms. Several conflicting issues. Staticman node modules are way out of date and have issues = staticman problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests