Result differences with respect to Lens #35

Open
lplana opened this issue Aug 30, 2020 · 1 comment

@lplana
Member

lplana commented Aug 30, 2020

@joannavioletmoy reports result differences with respect to Lens when training network rand10x40 for 300 epochs, testing after every 10 epochs.

  • Using steepest descent:
    The number of examples correct after each test differs widely between PDP2 and LENS. PDP2 gets about three quarters of the examples correct before LENS gets a single one correct.
  • Using Doug's momentum:
    LENS starts getting examples correct before PDP2 does, although the difference is less marked in this case.

Although differences are expected due to fixed-point (PDP2) vs double (Lens) numeric representation, further verification is needed because implementation issues could also be the cause.
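
To illustrate how numeric representation alone could produce diverging training trajectories, here is a minimal sketch in Python. It assumes a hypothetical fixed-point format with 15 fractional bits and round-to-nearest; the actual PDP2 format and rounding mode are not stated in this issue. The helper `to_fixed` and the step size are illustrative only.

```python
FRAC_BITS = 15  # assumption: hypothetical format, not confirmed for PDP2
SCALE = 1 << FRAC_BITS


def to_fixed(x: float) -> float:
    """Round x to the nearest multiple of 2**-FRAC_BITS (no saturation)."""
    return round(x * SCALE) / SCALE


def accumulate(step: float, n: int, quantize: bool) -> float:
    """Add the same small weight update n times, optionally quantizing each time."""
    w = 0.0
    for _ in range(n):
        w += step
        if quantize:
            w = to_fixed(w)
    return w


step = 1e-5  # hypothetical update, smaller than half the resolution (about 1.5e-5)
w_double = accumulate(step, 1000, quantize=False)
w_fixed = accumulate(step, 1000, quantize=True)
print(w_double, w_fixed)
```

Under these assumptions the double-precision weight reaches about 0.01, while the quantized weight never leaves 0 because each update is below half of the resolution and is rounded away. Effects of this kind would be expected to differ between training algorithms, since they produce weight steps of different sizes.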

@lplana changed the title from "Result differences with respect to Lens need to be reviewed" to "Result differences with respect to Lens" on Aug 30, 2020
@lplana
Member Author

lplana commented Apr 28, 2021

@joannavioletmoy reports:

The output of one unit is 'flipped' in PDP2 relative to LENS in epoch 7 (nearly 1 in PDP2, nearly 0 in LENS). However, this appears to be because the dot products from which the value is calculated are also wildly divergent and have opposite signs (9.06 in PDP2 and -12.75 in LENS). This seems to reflect the fact that by this epoch, most of the weights in LENS are negative, whilst still being positive in PDP2. Closer inspection shows increasing differences in the weights with each epoch.
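
As a quick check of the reported values, the sketch below applies the standard logistic function to the two dot products. This assumes the unit uses a plain logistic activation with no additional gain, which this issue does not state.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


out_pdp2 = logistic(9.06)    # dot product reported for PDP2
out_lens = logistic(-12.75)  # dot product reported for LENS
print(f"PDP2: {out_pdp2:.6f}  LENS: {out_lens:.2e}")
```

Under that assumption the two inputs give outputs very close to 1 and very close to 0 respectively, so the 'flipped' output follows directly from the sign difference in the dot products rather than from the activation step itself.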

I have noticed big differences between the calculated values of the output derivatives in PDP2 and LENS. I've traced these right back to the first epoch and found that, although the differences are relatively small at this stage, they seem to be much larger for units whose target is 1 than for those whose target is 0. This appears to be the case across all the examples in the example set. I therefore suspect that the pattern may be caused by the fact that we currently represent a target of 1 as 0.999969. However, I'm not 100% sure about this, nor have I established definitively whether this is the cause of the 'flipped' output value I am seeing in epoch 7, so I'm still looking into it.
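
The suspicion about the target value can be made concrete with a small sketch. The value 0.999969 matches 1 - 2^-15 to six decimal places, which would be the largest value below 1 in a format with 15 fractional bits; that format is an inference, not something stated here. The sketch also assumes a squared-error output derivative (output minus target); if the network uses a different error function, the size of the effect will differ.

```python
TARGET_ONE_FIXED = 1.0 - 2.0 ** -15  # assumption: 15 fractional bits


def output_derivative(output: float, target: float) -> float:
    """Derivative of 0.5 * (output - target)**2 with respect to output."""
    return output - target


output = 0.5  # hypothetical unit output early in training

# Target 1: exact representation vs. the largest representable value below 1
diff_target_one = abs(
    output_derivative(output, 1.0) - output_derivative(output, TARGET_ONE_FIXED)
)
# Target 0: exactly representable, so the target itself introduces no error
diff_target_zero = abs(
    output_derivative(output, 0.0) - output_derivative(output, 0.0)
)

print(round(TARGET_ONE_FIXED, 6), diff_target_one, diff_target_zero)
```

Under these assumptions the target alone contributes a derivative offset of 2^-15 (about 3.05e-5) for units whose target is 1 and none for units whose target is 0. That is consistent in direction with the asymmetry described above, but whether it accounts for the observed magnitude would still need checking.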
