Ticket 18903: GBPhoneNumberField #356

wants to merge 48 commits into


None yet

4 participants


This pull adds support for fields representing UK phone numbers, as well as tests and documentation.


Your fork didn't pick up the vital change in
from September 7th as I added that change a bit later in the day. Please add it to your repo. :-)

Thanks for your input so far. The conversion is fantastic. I have this code working and tested in Java and PHP already so I know the RegEx patterns used for those are 100% correct.

On September 10th I imported your four changes from September 7th and 8th into my repo.
On September 11th I made seven further commits to my repo g1smd/django but have no idea how to get them across to your stream.

I have changed the two RegEx patterns to multi-line format as that makes them much easier to read. I also made several small but vital changes to those patterns. (I'll update the wiki examples later).

Note that it is the valid_format pattern that stops numbers with incorrect NSN length making it through to later stages. This one allows a very wide range of input formats but chucks out a lot of garbage.

By splitting input format checking, NSN extraction, number range and length checking, and the output number formatting into separate processes much more detailed checks can be made at each stage.

I have also added a comment block listing accepted number formats for entry.

The final commits add a space after the +44 prefix, and fixes multiple identical cut and paste typos made in an earlier commit and some other typos and fixes.

Please update your repo to include all of this new code before making further changes, as I can then easily track the diffs after that point.

Thanks again for your fantastic effort in fixing up my broken python guesswork - and please check that I haven't introduced a silly syntax typo somewhere in the new code. In particular I don't know when to use ' or """ around values.

Django member

@brad: You probably want to read https://docs.djangoproject.com/en/dev/internals/contributing/writing-code/working-with-git/ and given https://groups.google.com/forum/?fromgroups=#!topic/django-developers/4EvlunsJ0LY I suggest you delay your work to after 1.5 since I highly doubt we'd merge your stuff before moving localflavor out. Btw why do you need such crazy checking at all? Even if the regex says a phone number is valid it doesn't mean it actually exists, so you have to call them to verify…


Unlike the US and Canada where phone numbers always have 10 digits including a 3 digit area code, GB numbers have area codes from 2 to 5 digits in length and total number length of 9 or 10 digits - and sometimes mixed length numbers within the same area code. There's a lot more scope for making an error, but a few RegEx patterns can eliminate the vast majority of issues. The idea here is to eliminate the worst of the errors.

Additionally, many users don't always format their number correctly. Someone might write the London number 020 3000 5555 as 02030 005 555 or as +44 203 000 5555 akin to someone writing the New York number 212-555-7777 as 2125-557-777 or as +12-125-557-777. The user should be allowed to do that and then a few lines of simple code can reformat the number correctly for display.

International numbers are also often misunderstood, with people writing +44 20 3000 5555 as 00 44 20 3000 5555 (which won't work from the US or Canada) or as 011 44 20 3000 5555 (which won't work from Europe). These issues are also easily solved and the number can then be presented in the correct format.


@apollo13 Thanks for pointing that out. I am aware that localflavors is getting moved out and decided to take my chances anyway. It's not the end of the world to me if I have to reapply the patch there.

@g1smd Can you please work on your changes and make sure they apply cleanly to my ticket_18903 branch? I'm getting conflicts and it is not clear the best way to resolve them.


I'm not sure what to do. There's thousands of changes made by other people. I've no idea which apply or not.


Yes, I've got the latest changes from django master in my branch. A merge should automatically accept all those thousands of changes without issue.


All done.

I had to manually edit gb/forms.py again (actually just replaced it with a copy I already had as a backup in another folder).


I have also refactored the RegEx patterns slightly and made them more efficient.


That is much better, except those regexes aren't working for me (the unit tests fail). Can you explain what the differences are?


If they all fail, I assume a typo. The new patterns are more strict and reject various area codes, prefixes and number ranges that do not exist and allow a range of dial prefixes with a variety of punctuation in various places.

Some of the RegEx patterns begin r' and others begin r""". I don't understand the difference. These RegEx patterns are working in another application, so I guess it's a python syntax problem/typo. I have now changed the ' to """ in several places.

Are you aware that the output is now formatted, rather than just a string of digits? The number format conforms to that listed at: http://www.aa-asterisk.org.uk/index.php/Number_format

Updated tests. Any further problems?


Yeah, same problem as before, error: bad character range when parsing the gb_number_parts regex.
The """ just allows for multi-line strings. I don't think inline comments can be used in these expressions, but even with them removed I see the same error.


OK. That's something concrete to go on.

Found a likely candidate, and fixed it. The '[\s-\d]' should probably be '[\s\d-]' here.

Inline comments should be usable with the 're.X' flag if it works anything like the PHP RegEx '/x' flag. Just be sure there are no slashes in the comments.


Good work, the tests are now passing!



I'm not fazed by RegEx issues.

I have no idea what a pep8 is, but thanks for taking the time to do that too. :-) Good stuff.

What happens next?


Is it worth adding more test data?

0114 223 4567 => Valid: +44 114 223 4567
01145 345 567 => Valid: +44 114 534 5567 <= typo (fixed in later commit)
+44 1213 456 789 => Valid: +44 121 345 6789
00 44 (0) 1697 73555 => Valid: +44 16977 3555
011 44 11 4890 2345 => Valid: +44 114 890 2345 <= typo (fixed in later commit)
025 4555 6777 => NOT VALID
0119 456 4567 => NOT VALID
0623 111 3456 => NOT VALID
0756 334556 => NOT VALID


Yes, ideally we could cover all cases covered by the regular expressions.


With optional spaces, hyphens and brackets, and the various area code and number lengths, there's over 400 valid combinations.


I'm not too concerned about space/hyphen variations, since those cases are covered simply by removing those characters from the string. I'm more concerned with different area code lengths which I did not cover in my original test set.


Those listed above, plus these:

020 3000 5000 => Valid: +44 20 3000 5000
0121 555 7777 => Valid: +44 121 555 7777
01750 615777 => Valid: +44 1750 615777
019467 55555 => Valid: +44 19467 55555
01750 62555 => Valid: +44 1750 62555
016977 3555 => Valid: +44 16977 3555
0500 777888 => Valid: +44 500 777888
020 5000 5000 => NOT VALID
0171 555 7777 => NOT VALID
01999 777888 => NOT VALID
01750 61777 => NOT VALID


Can you please specify what the formatted result should be for each valid number?


Amended list above. Also added several 'non-valid' numbers. Added to file and pushed.


The test doesn't pass. Unfortunately it doesn't inform me which number(s) failed. I'll use the process of elimination I guess.


Ack. It'll be a silly typo.


'01145 345 567': '+44 114 534 5567'
'011 44 11 4890 2345': '+44 114 890 2345'
both fail at the valid_gb_phone_range step. I printed out number_parts for each number:
{u'prefix': u'+44 ', u'extension': None, u'NSN': u'1145345567'}
{u'prefix': u'+44 ', u'extension': None, u'NSN': u'1148902345'}

does that look right? Is the problem in the test or the validation?


Yes, stupid typo. I had intended to use the 0141 area code, but typed 0114 which has less valid ranges.

Fixed and pushed. Apologies!


Looks good, I think this is ready for checkin.


Does anything else need amending on the ticket?


I don't think so, now we just wait for it to catch someone's eye and mark it "Ready for Checkin". I don't think either of us should mark it so, since we were both involved in the patch.


I'm going to go ahead and add @apollo13 's comment here since it seems important but it was deleted:

Given this crazy regex I suggest you test if it's vulnerable to backtracking attack (http://www.regular-expressions.info/catastrophic.html -- yes you actually can DOS a server with something like this, we had this in URLField a while ago)


I've not come across that before, but I don't think it applies here because the patterns to match are always a fixed number of digits, and designed to parse left to right - and fail without much in the way of backtracking. There's very limited use of + or * quantifiers which seem to be the main part of the problem described in the article.

However, I'll look into it a bit more.

Django member

@brad githubs ui fooled me :/


@apollo13 hehe, it happens to all of us


Are there any improvements that can be made to error messages?

"Not a valid format and/or number length" for non-matching valid_gb_format.

"Not a valid range or not a valid number length for this range" for non-matching valid_gb_range.


OK, I have added both messages and the tests now check that the error message is correct.



I don't know how deep to take the error checking.

I'm tempted to also add another simple code chunk right near the beginning:

If number matches [^\+\(\)\dext.\#\s-]

with error message

"Input contains invalid characters or punctuation. Use only digits and '+', '(', ')', 'ext.', 'x' and '#' with optional spaces and/or hyphens."

with test number 020@7000 8000 (preferred);

or even simply automatically clear all those unwanted junk characters from the input before processing (not preferred).

Additionally, it would be possible to add

If NOT match (r"""^\(?(?:(?:0(?:0|11)\)?[\s-]?\(?|\+)44\)?[\s-]?\(?(?:0\)?[\s-]?\(?)?|0).*$""")

with error message:

"Number must begin 0, +44, 00 44 or 011 44 followed by 9 or 10 digits and optional 'ext.', 'x' or '#' extension.

with test number 00 33 1 44 55 66 77 (preferred);

or else amend the existing 'Not a valid format and/or number length.' error message to instead read 'Not a valid format and/or number length. Number must begin 0, +44, 00 44 or 011 44 followed by 9 or 10 digits and optional 'ext.', 'x' or '#' extension.' (not preferred).

We also need a test number like 020 7000 9000 x4567 to test extension handling; result +44 20 7000 9000 #4567.

I don't know how much feedback to give the user. I want to keep things simple, but also feel that more specific error messages are a good idea where possible.

It's not easy to give specific warnings about too many digits or too few digits. That will have to be skipped here, relying only on the current "non valid length" message to cover both cases.

There are loads of other checks that could be done - such as warning when multiple contiguous spaces or hyphens or multiple unwanted punctuation is used (or automatically removing it), but there's diminishing returns on module speed. I'm not too bothered about checking for stuff like ((020))---7000---8000 as it should already be obvious to the user that they've already typed way too much extra crud.

Where's the dividing line between interpreting what the user wanted and just getting on with it (hence allowing 00 44 and 011 44 in addition to the usual +44) or saying 'you've botched the input, do it again'? I tend to write more detailed code to make a simpler or more intuitive user experience, but I know that not everyone holds that view perhaps preferring the bare minimum code for improved speed.


I added tests for numbers with extensions. Please review my changes. I think this is as far as we should take the validation. While I appreciate the complexities of these numbers, there is a balance like you say between too much and too little validation. I fear it will be too much if it is taken any further, I think we have struck a good balance here, you are welcome to keep adding to it if you like.


Looks OK.

I'll hold off until there is some user feedback. If users want more detailed error messages they can be added later.


I added a comment to the ticket... ""We think this is 'Ready for Checkin'""" asking for others to take a look.

Django member

Hey there -- django.contrib.localflavor is now deprecated, and we're not making any more changes to it. Could you reopen this pull request for the shiny new package django-localflavor-gb? Here's the link: https://github.com/django/django-localflavor-gb

Sorry we didn't get to this pull request before the deprecation. I hope it's not too much of a pain to migrate this to the new package.


@brad has the latest version. My copy is out of sync.


No problem, I'll take care of it. Thanks for the heads up, Adrian.


Looks good. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment