Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert manufacturer.txt to python dictionary #33 #35

Merged
merged 2 commits into from Nov 13, 2014
Merged

Conversation

bucciarellialessandro
Copy link
Contributor

  • first idea of implementation

@bucciarellialessandro
Copy link
Contributor Author

I created this PR in order to discuss together on the approach to use.
This is not intended to be the final version. As it is, this feature gives the possibility to create a manufacturer.py file from manufacturer.txt. The .py file stores a dictionary containing the key/value pairs of MAC addresses prefix and related manufactuerers. Importing this file the time consuming activity of parsing the file looking for the manufacturer can be avoided.

@nemesifier
Copy link
Member

Great!

I've taken a look at the python dictionary, there are a few improvements that can be done:

  • keys can be stripped of spaces at the beginning and end of the string
  • keys should be stripped of every character that is not a number, I explain later why
  • the order of the keys can be retained by adding sort_keys=True to json.dumps
  • it's probably better to call the file manufacturers.py (the s is missing)

As a further improvement, I was thinking that it would be very handy and it would save us a lot of time in the future if the txt file could be downloaded directly from the internet with an http request, so we can avoid storing it in the repository. Do you think you can do it?

Regarding how to call the code in the library, we could write a simple function in netengine.utils that takes a mac address (accepting both entire or partials) as input and returns the manufacturer name.
The function should be able to handle different formats of mac addresses:

  • aa:00:bb:cc:dd:ee
  • AA:00:BB:CC:DD:EE
  • AA-BB-CC-DD-EE-FF
  • aabbccddeeff
  • even partial mac addresses (just the prefix): aa:bb:cc

If you agree with me we can do these things in 3 separate steps, I can write the final function.

@bucciarellialessandro
Copy link
Contributor Author

I mostly agree with you on all points.
You're right about the improvements on the json dictionary, as soon as possibile I will trim the white spaces and I will add the sorting on the dictionary keys.
Where could be better to store the .txt file? On github itself and hardcode an URL in the code?

As regards to dictionary lookup you're right, I think it would be awesome if we store in the dict the keys as "AABBCCDDFF" and make some operation on the string to be compared.
The partial MAC address comparison can be achieved by trimming all but the first three bytes.

So could be an idea of an operational workflow:

  1. receive the MAC string to look for in the dict
  2. check against the presence of ":" or "-"
  3. trim all but the first three bytes (from MAC[0] to MAC[7]) which is exactly the prefix
  4. replace ":" or "-" with ""
    4a) perform, if necessary, a call to the str upper function on MAC
  5. dictionary lookup
  6. return the matching manufacturer

What do you think? :)

@bucciarellialessandro bucciarellialessandro changed the title Netengine: manufactuers are now stored in .py file Netengine issue#33: manufactuers are now stored in .py file Nov 8, 2014
key = pairs[0].strip('\t\n\r')
value = pairs[1].strip('\t\n\r')
dictionary[key] = value
json_file = json.dumps(dictionary, indent = 4)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few suggestions:
indent=4 should be without spaces
also add here sort_keys=True

@nemesifier
Copy link
Member

If the URL of the txt file is stable over time we can hardcode it in parser.py and let the script take care of downloading it and we can avoid storing it in the repository.

The workflow you described is correct, although you can simplify by doing just 4 operations:

@bucciarellialessandro
Copy link
Contributor Author

Why first point? Letters are valid char for a MAC address, aren't they? For example AA:AA:AA(:AA:AA:AA)

@nemesifier
Copy link
Member

Yes sorry strip anything that is not an alphanumeric character [a-zA-Z0-9]

@cl4u2
Copy link
Contributor

cl4u2 commented Nov 9, 2014

MAC address is hexadecimal, so it would be better to strip anything that is not [a-fA-F0-9]

@nemesifier
Copy link
Member

@cl4u2 👍 for [a-fA-F0-9]

nemesifier added a commit that referenced this pull request Nov 9, 2014
@bucciarellialessandro
Copy link
Contributor Author

Hi guys,
I don't agree in stripping all but [a-fA-F0-9]. I'm going to make the point clear.
Image we receive a strange MAC address AA:AA:GG:GG:AA:AA, which is clearly wrong (it has G's).
The strip on G's will produce AA:AA:AA:AA. After some manipulations on the MAC we will obtain AAAAAA, ready for the lookup into the manufacturer's dictionary.
Sadly, this will produce a valid manufacturer because AAAAAA is in the dictionary, even though the initial mac address wasn't.

We could have introduced a bug.
Do you all agree?

import json



Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove 1 line

@nemesifier nemesifier changed the title Netengine issue#33: manufactuers are now stored in .py file Netengine issue #33: manufactuers are now stored in .py file Nov 13, 2014
@nemesifier nemesifier changed the title Netengine issue #33: manufactuers are now stored in .py file Netengine issue #33: convert manufacturer.txt to python dictionary Nov 13, 2014
@nemesifier nemesifier changed the title Netengine issue #33: convert manufacturer.txt to python dictionary Convert manufacturer.txt to python dictionary #33 Nov 13, 2014
@nemesifier
Copy link
Member

👍

nemesifier added a commit that referenced this pull request Nov 13, 2014
Converted manufacturer.txt to python dictionary #33
@nemesifier nemesifier merged commit 2cf6dc7 into master Nov 13, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants