Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code and repo audit #160

Open
apohllo opened this issue Nov 24, 2023 · 39 comments
Open

Code and repo audit #160

apohllo opened this issue Nov 24, 2023 · 39 comments

Comments

@apohllo
Copy link

apohllo commented Nov 24, 2023

README.md

  • I would suggest: NameGuard by NameHash as a title, current title might be confusing
@apohllo apohllo changed the title Testing issue Code audit Nov 24, 2023
@apohllo
Copy link
Author

apohllo commented Nov 24, 2023

A more context could be given in the readme, i.e. why namegaurd could be useful. Some potential use-cases would be valuable.

@apohllo
Copy link
Author

apohllo commented Nov 24, 2023

In the impersonation section giving actual comparison between safe and unsafe names would be very valuable.

@apohllo apohllo changed the title Code audit Code and repo audit Nov 24, 2023
@apohllo
Copy link
Author

apohllo commented Nov 24, 2023

  • API/README.md
    I tried running
pip install nameguard

and got:

ERROR: Could not find a version that satisfies the requirement nameguard (from versions: none)
ERROR: No matching distribution found for nameguard

My env is as follows:
Python: 3.11.3

@apohllo
Copy link
Author

apohllo commented Nov 24, 2023

image

@apohllo
Copy link
Author

apohllo commented Nov 24, 2023

If the pypi package is not yet available, maybe extend the README to include instruction on how to install the lib currently, using poetry.

@apohllo
Copy link
Author

apohllo commented Nov 24, 2023

Add | python -m json.tool in the curl call - this will make the result much more readable.

@apohllo
Copy link
Author

apohllo commented Nov 24, 2023

I don't get the difference between:

image

and

image

The messages are almost the same.

Ok, so one refers to the name as a whole, and the other to individual parts of the name. This should be reflected better in the message. Maybe the "label" part should talk about plural.

@apohllo
Copy link
Author

apohllo commented Nov 24, 2023

This one is pretty obscure (at least looking at the message):

image

@apohllo
Copy link
Author

apohllo commented Nov 24, 2023

At the end when running the non-monkey-patched version, I've got a bunch of errors:

image

I guess some additional service should be running on our machine:

image

This should be mentioned in the readme.

Update: there should be info, that if you get these errors, you should set up those API keys mentioned above.

@apohllo
Copy link
Author

apohllo commented Nov 24, 2023

  • In the project definition file

authors = ["NameHash Team <devops@namehash.io>"]
maintainers = ["NameHash Team <devops@namehash.io>"]
homepage = "https://github.com/namehash/nameguard"
repository = "https://github.com/namehash/nameguard"
readme = "README.md"
license = "LICENSE"

  • Are these emails maintained?
  • It would be better to provide emails to real people, not only the organization catch-all emails.
  • Homepage should be updated
  • License should be changed to MIT

@apohllo
Copy link
Author

apohllo commented Nov 24, 2023

Why conf name is used here?

def grapheme_is_normalized(conf: str) -> bool:
if len(conf) == 1 and ord(conf) in NORMALIZATION.valid:
return True

Why not grapheme?

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

I would suggest merging grapheme_normalization with generic_utils.

There's also utils file, which seems to have a similar purpose.

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

Setting debug as the default log level does not seem to fit general audience

logger.setLevel(logging.DEBUG)

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

This regex is not resistant against new-line injection.

ALCHEMY_UNKNOWN_NAME = re.compile('^\[0x[0-9a-f]{4}\.\.\.[0-9a-f]{4}\]\.eth$')

The anchors should be changed int \A and \z.

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

Since the addresses have different meaning, why not use a dictionary?

ens_contract_adresses = {
'0x57f1887a8bf19b14fc0df6fd9b2acc9af147ea85', # Base Registrar
'0xd4416b13d2b3a9abae7acd5d6c2bbdbe25686401', # Name Wrapper
}

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

Consider missing key error, which would generate obscure error massage for a nested key:

def nested_get(dic, keys):
for key in keys:
dic = dic[key]
return dic

Maybe there should be error showing all elements of the keys list or a None value, if the path does not exist?

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

Two-letter instance variable which is not documented:


Hard to understand it's meaning from the name itself (name-service?).

self.services ora self.name_services would be a better name.

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

Missing return value type:

def analyse_label(self, label: str):
return self._inspector.analyse_label(label, simple_confusables=True, omit_cure=True)

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

The doc says namehashes won't be analyzed, while in fact they might be resolved (according to the argument):

'''
Inspect a name. A name is a sequence of labels separated by dots.
A label can be a labelhash or a string.
If a labelhash is encountered, it will be treated as an unknown label.
'''

Consider changing the doc.

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

What about names such as ...?

labels = [] if len(name) == 0 else name.split('.')

Shall the empty labels always be removed? Or for that case an empty string is ok?
If we allow empty labels in the call, why we don't do the same for an empty name?

image

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

I would suggest changing label_analysis into label

labels_graphemes_checks = [
[
[check(grapheme) for check in GRAPHEME_CHECKS]
for grapheme in label_analysis.graphemes
] if label_analysis is not None else []
# label has [] graphemes if it's a labelhash
for label_analysis in labels_analysis
]

Currently there's little difference between label_analysis and labels_analysis which makes the code harder to understand. The call lable.graphemes would be even more natural.

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

The same remark applies to the following code:

labels_checks = [
[check(label_analysis) for check in LABEL_CHECKS]
# checks have to handle labelhashes
for label_analysis in labels_analysis
]

label would be better than label_analysis.

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

Does the DNA checks require previous aggregation of results?

for check_g, check_l, check_n in DNA_CHECKS:
for label_i, label_analysis in enumerate(labels_analysis):
if label_analysis is not None:
for grapheme_i, grapheme in enumerate(label_analysis.graphemes):
labels_graphemes_checks[label_i][grapheme_i].append(check_g(grapheme))
labels_checks[label_i].append(check_l(label_analysis))
name_checks.append(check_n(labels_analysis))

I don't understand why the application and organization of these checks differs from the other checks.
A short comment would be helpful.

@apohllo
Copy link
Author

apohllo commented Nov 30, 2023

Taking into account that this a constructor:

return NameGuardReport(
name=name,
namehash=namehash_from_name(name),
normalization=Normalization.UNKNOWN
if any(label_analysis is None for label_analysis in labels_analysis)

it's very strange the single call to that method occupies more than 60 lines.

Have you considered a fluent API, a factory pattern or preparation of the argumetns up-front?
And then it also would be possible to move the logic, that currently is in the long part of the code to the specific method related to the arguments of the constructor.

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

The part of the code:

if any(label_analysis is None for label_analysis in labels_analysis)
else Normalization.NORMALIZED
if all(label_analysis.status == 'normalized' and len(label_analysis.label) > 0
for label_analysis in labels_analysis)
else Normalization.UNNORMALIZED,

Is rather hard to follow, since the values assigned to the param are intertwined with the logic responsible for their computation. A method computing the outcome would be much more readable.

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

It would be better to move the logic here:

canonical_name=compute_canonical_from_list(
[label_analysis.normalized_canonical_label
if label_analysis is not None
else labels[i] # labelhash
for i, label_analysis in enumerate(labels_analysis)],
sep='.',
),

into separate method, accepting labels_analysis.

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

The code

normalization=Normalization.UNKNOWN
if label_analysis is None
else Normalization.NORMALIZED
if label_analysis.status == 'normalized' and len(label_analysis.label) > 0
else Normalization.UNNORMALIZED,

has the same interleaved logic/value computation, which is hard to follow.

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

The same applies to this piece of code:

normalization=GraphemeNormalization.NORMALIZED
if any(check.status == CheckStatus.PASS and check.check is Check.NORMALIZED
for check in grapheme_checks)
else GraphemeNormalization.UNNORMALIZED,

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

I understand that this is Python-style:

for label, label_analysis, label_checks, label_graphemes_checks in zip(
labels,
labels_analysis,
labels_checks,
labels_graphemes_checks,

but putting the loop at the end once again makes the code harder to understand, since the variables are defined after they are used.

I would suggest a class/struct that would combine the individual pieces of information into one object.

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

This code and many similar pieces of code could be simplified from

([self._inspect_confusable(c)
for c in grapheme_analysis.confusables_other]
if grapheme_analysis.confusables_other else []),

to

([self._inspect_confusable(c) for c in (grapheme_analysis.confusables_other or [])]

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

Here

grapheme_checks = [check(grapheme) for check in GRAPHEME_CHECKS + [c[0] for c in DNA_CHECKS]]

grapheme relates to the grapheme analysis result. Would be better to make the names more consistent
Cf. this code:

grapheme_analysis = label_analysis.graphemes[0]
grapheme_checks = [check(grapheme_analysis) for check in GRAPHEME_CHECKS + [c[0] for c in DNA_CHECKS]]

There's also some non-trivial logic shared by those methods.

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

For me the name of the method:

async def secure_primary_name(self, address: str, network_name: str) -> SecurePrimaryNameResult:

implicates, that we are securing some name, but the logic is about checking if the name is secure.
validate_security, is_name_secure or some other name with a verb that refers to the proces would make the method more easy to infer it's semantics from.

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

Would it be possible to rewrite this code:

if domain is None:
status = SecurePrimaryNameStatus.NO_PRIMARY_NAME
impersonation_status = None
else:
nameguard_result = await self.inspect_name(network_name, domain)
result = ens_process(domain, do_normalize=True, do_beautify=True)
if result.normalized != domain:
status = SecurePrimaryNameStatus.UNNORMALIZED
impersonation_status = None
else:
display_name = result.beautified
status = SecurePrimaryNameStatus.NORMALIZED
primary_name = domain
impersonation_status = ImpersonationStatus.UNLIKELY if any(check.check == 'impersonation_risk' and check.status == CheckStatus.PASS for check in
nameguard_result.checks) else ImpersonationStatus.POTENTIAL

in a way, that all check are applied step-by-step with the same style of coding?
It's pretty hard to understand when given status will be returned for the name.

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

As far as I understand:

if token_type not in ['ERC721', 'ERC1155'] and contract_address in ens_contract_adresses:
return FakeEthNameCheckResult(status=FakeEthNameCheckStatus.UNKNOWN_NFT, nameguard_result=None, investigated_fields=None)
if token_type == 'NOT_A_CONTRACT':
return FakeEthNameCheckResult(status=FakeEthNameCheckStatus.UNKNOWN_NFT, nameguard_result=None, investigated_fields=None)
elif token_type == 'NO_SUPPORTED_NFT_STANDARD':
return FakeEthNameCheckResult(status=FakeEthNameCheckStatus.UNKNOWN_NFT, nameguard_result=None, investigated_fields=None)
elif token_type not in ['ERC721', 'ERC1155']: # Alchemy does not support other types
return FakeEthNameCheckResult(status=FakeEthNameCheckStatus.UNKNOWN_NFT, nameguard_result=None, investigated_fields=None)

all checks result in the same value being returned. Maybe using or would make to code simpler?

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

As mentioned earlier

try:
name = nested_get(res_json, keys)
investigated_fields['.'.join(keys)] = name
except KeyError:
pass

it would be better to change the logic not to rise a KeyError.

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

Else is not needed

if title is None:
return FakeEthNameCheckResult(status=FakeEthNameCheckStatus.UNKNOWN_NFT, nameguard_result=None,
investigated_fields=None)
else:
if is_labelhash_eth(title):

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

Making nameguard_report and investigated_fields None by default would simplify a lot of code:

return FakeEthNameCheckResult(status=FakeEthNameCheckStatus.UNKNOWN_NFT, nameguard_result=None, investigated_fields=None)

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

Else is not needed:

if is_ens_normalized(title):
return FakeEthNameCheckResult(status=FakeEthNameCheckStatus.AUTHENTIC_ETH_NAME,
nameguard_result=report, investigated_fields=None)
else:

@apohllo
Copy link
Author

apohllo commented Dec 1, 2023

Once again else is not needed since all previous paths result in a return and exit the method.

else:
impersonating_fields = {}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant