
normalize and denormalize issue #8

Open
Nathan-zh opened this issue Mar 31, 2022 · 4 comments

Comments

@Nathan-zh

Hi Brandon,

I recently installed your package and have been using its datasets. Thanks for your work building this benchmark.

I have a problem with the function task.normalize_x.

import numpy as np
import design_bench

# load the task
task = design_bench.make('AntMorphology-Exact-v0', relabel=False)
# select the 128 designs with the highest labels
aa = task.x[np.argsort(np.squeeze(task.y))[-128:]]
# predict the labels
r1 = np.squeeze(task.predict(aa))

# normalize and denormalize features
bb = task.normalize_x(aa)
cc = task.denormalize_x(bb)
# predict the labels again
r2 = np.squeeze(task.predict(cc))

print(np.max(r1), np.max(r2)) #--> 198.7532   406.76566
print(np.where((aa-cc)>0.001))  #--> (array([], dtype=int64), array([], dtype=int64))

The output shows that normalization followed by denormalization leaves the features essentially unchanged, yet the predictions are quite different. Is there anything wrong with my code? It feels like a trivial issue, but I cannot figure out where the problem is.

Nathan

@brandontrabucco
Owner

Hi Nathan-zh,

Thanks for bringing this issue to my attention about the AntMorphology task! From the snippet you provided, the issue may be caused by floating point errors that occur when computing the mean and standard deviation for normalizing the designs. The relevant code is at this location:

def update_x_statistics(self):

In essence, the mean and standard deviation statistics are calculated in a streaming fashion that does not require the entire dataset to be in memory at once (a design choice I made to accommodate larger MBO tasks in the future that load the dataset directly from disk one (x, y) pair at a time). This could be exacerbating floating point error in the calculation of the normalization statistics, which is one possible explanation for the difference you are seeing above.
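For readers unfamiliar with streaming statistics, a minimal sketch of the idea (this is an illustration using Welford's online algorithm, not the actual design_bench implementation) looks like this — note that accumulating in float64 is what keeps the result stable:

```python
import numpy as np

def streaming_mean_std(batches):
    """Compute per-feature mean and std over batches without holding
    the full dataset in memory (Welford-style online update)."""
    count = 0
    mean = None
    m2 = None  # running sum of squared deviations from the mean
    for batch in batches:
        batch = np.asarray(batch, dtype=np.float64)  # accumulate in float64
        if mean is None:
            mean = np.zeros(batch.shape[1])
            m2 = np.zeros(batch.shape[1])
        for row in batch:
            count += 1
            delta = row - mean
            mean += delta / count
            m2 += delta * (row - mean)
    std = np.sqrt(m2 / count)
    return mean, std

# toy dataset with very different feature scales, fed in two batches
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3)) * np.array([1.0, 50.0, 300.0])
mean, std = streaming_mean_std([data[:60], data[60:]])
print(mean, std)
```

The per-batch update means only one batch ever needs to be resident in memory, at the cost of more rounding steps than a single vectorized `np.mean` / `np.std` call.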

Switching to float64 for the AntMorphology task might be necessary if the problem is due to floating point error.

@Nathan-zh
Author

It does help if I switch to float64. I think this numerical problem is caused by the distribution of features, i.e. most values are around 0 but a few values could be as large as 200-300.
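The effect of that distribution is easy to reproduce on synthetic data (a toy sketch, not the AntMorphology dataset itself): when one feature sits near 250 while the rest are near zero, a float32 normalize/denormalize round trip picks up errors around the ulp of the large values, while float64 stays many orders of magnitude tighter.

```python
import numpy as np

# Toy design vectors: mostly near zero, one feature around 250,
# mimicking the mixed-scale distribution described above.
rng = np.random.default_rng(0)
x = rng.normal(scale=0.01, size=(1000, 8))
x[:, 0] += 250.0  # one large-scale feature

def roundtrip(x, dtype):
    """Normalize then denormalize in the given dtype."""
    x = x.astype(dtype)
    mean = x.mean(axis=0, dtype=dtype)
    std = x.std(axis=0, dtype=dtype)
    return (x - mean) / std * std + mean

err32 = np.abs(roundtrip(x, np.float32) - x.astype(np.float32)).max()
err64 = np.abs(roundtrip(x, np.float64) - x).max()
print(err32, err64)  # float32 error is far larger than float64
```

A maximum float32 error on the order of 1e-5 is small in absolute terms, but if the oracle is sensitive to the inputs, it can translate into visibly different predictions.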

@Nathan-zh
Author

Hi Brandon,

Here is another issue, this time with the Hopper Controller task. I am using the exact oracle model, which means predictions should be identical to the labels, or at least close, since the oracle is a simulator.

import numpy as np
import design_bench

task = design_bench.make('HopperController-Exact-v0', relabel=False)
pred = task.predict(task.x[:10])
print(np.squeeze(pred))
# --> [59.14225  72.68841  57.52715  59.30107  68.945305 95.25469  54.364407
#      58.248234 57.225212 55.378292]

print(np.squeeze(task.y[:10]))
# --> [108.34371  128.48705  103.78237   92.259224 147.93976  124.293274
#      117.06348  148.98955  133.39757  101.68808 ]

But the output of this snippet is not what I expected. Would you please test this code? Thanks!

Nathan

@Nathan-zh Nathan-zh reopened this Apr 3, 2022
@brandontrabucco
Owner

brandontrabucco commented Apr 3, 2022

Thanks for pointing this out. I currently think the issue is that the original dataset was collected with a stochastic policy, but in order to speed up evaluation, I implemented the oracle for this task as deterministic in the benchmark, so that we don't need to average the performance of more than one rollout.
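A toy illustration of why those two numbers can differ systematically (this is a hypothetical one-step example, not the benchmark's code): when the return is a nonlinear function of the action, the average return of a stochastic policy is not the return of a single deterministic rollout at the policy mean.

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(action):
    # hypothetical nonlinear reward, curvature is what causes the gap
    return -action ** 2

mean_action = 1.0

# returns logged while collecting data with a stochastic policy
stochastic_actions = mean_action + rng.normal(scale=0.5, size=10_000)
stochastic_return = reward(stochastic_actions).mean()  # approx -(1 + 0.25)

# return of one deterministic rollout at the policy mean
deterministic_return = reward(mean_action)  # exactly -1

print(stochastic_return, deterministic_return)
```

In this sketch the gap equals the action variance; in a real locomotion simulator the relationship is far more complex, but the same mismatch between dataset labels and deterministic oracle predictions applies.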

There is a pull request about this I have yet to merge, I'll let you know once I do:
#3
