Skip to content
This repository has been archived by the owner on Jan 3, 2018. It is now read-only.

Script to test index.html for validity #341

Merged
merged 16 commits into from Apr 5, 2014

Conversation

philippbayer
Copy link
Contributor

For issue #320

This is by no means exhaustive but I think a good start.

7. instructor list should be a valid Python/Ruby list
'''

import sys as _sys
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the first time that I see this.

If you have plans to use a variable named sys you should use sys_ following PEP8:

single_trailing_underscore_: used by convention to avoid conflicts with Python keyword

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good that you're as confused about this as I am!

setup/swc-installation-test1.py and setup/swc-installation-test2.py do this, so I thought this was some kind of internal standard that I knew nothing about, so I copied it: cargo-cult programming. Will delete this in the next commit.

@rgaiacs
Copy link

rgaiacs commented Feb 25, 2014

@DrSnuggles, Thanks for the contribution. =)

@philippbayer
Copy link
Contributor Author

Cool! Check the newest commit, I put in all of your comments except that I left the check for countries like it is (for now?). The script also now checks for that template header and warns the user if it still exists.

Edit: Also, currently the script assumes that there is a email address, but the Jekyll code in index.html puts in the site's contact if the email address isn't specified. Is that behavior still wanted? Then I'd have to change the categories rule a bit.

filename = args[1]

sys.stderr.write('Testing file "%s".\n' %filename)
index_fh = open(filename)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You forgot to close the file. What about use with?

@rgaiacs
Copy link

rgaiacs commented Feb 26, 2014

@DrSnuggles Thanks for all the help with this script.

Could you add the check function for all the categories, including layout (just check if the value is bootcamp)?

The script also now checks for that template header and warns the user if it still exists.

Nice.

Edit: Also, currently the script assumes that there is a email address, but the Jekyll code in index.html puts in the site's contact if the email address isn't specified. Is that behavior still wanted? Then I'd have to change the categories rule a bit.

@gvwilson @amyrbrown @arlissc What is the wanted behavior?

@philippbayer
Copy link
Contributor Author

I think that I now got all categories, and it now uses the "with" operator.

If someone could test this under Windows I'd be grateful - so far only tested on Fedora 19, python 2.7.5 and 3.3.2

@rgaiacs
Copy link

rgaiacs commented Mar 5, 2014

@DrSnuggles Sorry for the delay to do another review round.

  • What about move the script from setup/swc-index-test.py to bin/validate_index.py? Look like that the setup folder is use for script that help students when configuring their machines.
  • The bootcamp metadata is in Yaml format. What about using PyYAML to handle it? The get_header() function could return the result of PyYAML's load function instead of just a string. This approach has a few advantages, e.g. is the approach used when build our site.
  • Will be great if we have some test functions. I can write the tests as part of the review process and send it as a PR for you if you OK with that.

@philippbayer
Copy link
Contributor Author

Hi,

thanks for the comments! No worries about the tiny delay, I'm a bit busy right now anyway.

I'll implement the suggestions, pyYAML is a way better idea since then we wouldn't have to worry about white-spaces etc.

if you want to write tests, go ahead :)

@philippbayer
Copy link
Contributor Author

OK, done! I'm now using pyyaml.load() instead of pyyaml.safe_load(), since it has a few more features (for example, a valid date is automatically a datetime.date object), and I don't think this index-checker is in any way security relevant. I had to adjust a few tests a little bit.

- Add unit test to all check functions
- Add unit test to file handler

Changes in swc_index_validator.py

- Remove some calls to `strip()`

  They aren't necessary any more because we use PyYAML.
- Changes in `check_contry()`

  Use only one conditional
- Change in `check_intructor()`

  Check if the list isn't empty.

TODO:

- Fix the unit test that fail.
@rgaiacs
Copy link

rgaiacs commented Mar 6, 2014

@DrSnuggles I just send you a PR with the unit tests.

There is two last thing that I would like before accept this PR:

  • There is two test that fail and will be great if they pass.
$ nosetests bin/test_swc_index_validator.py
...
FAILED (failures=2)

@philippbayer
Copy link
Contributor Author

Cool, thank you!

I've introduced extra-checks to fix the tests (humandate's year has to be all-numbers, humandate's day has to contain some numbers, humantime has to contain some numbers), and I fixed a minor issue with the test's StringIO behaving differently under Python2, now it should work under both 2 and 3 (at least it does on my Fedora).

By the new string formatting method, do you mean to replace all instances of

"My name is %s" %(foo)

by

 "My name is {0}".format(foo)

?

@rgaiacs
Copy link

rgaiacs commented Mar 8, 2014

@DrSnuggles The answer to your question is YES.

Sorry for the delay to reply your question. The email notification send by GitHub don't have your question. (Did you edit/update your last comment? I would like to discovery why this happen to avoid it happen again.)

And thanks to make it work with Python2.


def check_layout(layout):
'''Checks whether layout equals "bootcamp".'''
if layout != 'bootcamp':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return layout == 'bootcamp'


def check_humantime(time):
'''A valid humantime contains at least one number'''
return bool(re.match(HUMANTIME_PATTERN, time.replace(" ","")))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need the bool conversion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is (I think) that the AND operator I use on the returned function expects a boolean variable:

TypeError: unsupported operand type(s) for &=: 'bool' and '_sre.SRE_Match'

I could wrap the re.match into an if-statement, but that is a bit ugly:

if re.match(re.match(HUMANTIME_PATTERN, time.replace(" ","")):
   return True
return False

So I thought that the bool-conversion is the "laziest" and nicest way.. could I make it nicer?

gvwilson pushed a commit that referenced this pull request Apr 5, 2014
Script to test index.html for validity
@gvwilson gvwilson merged commit 76585ed into swcarpentry:master Apr 5, 2014
wking pushed a commit to wking/swc-boot-camps-v2 that referenced this pull request Sep 8, 2014
Script to test index.html for validity
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants