Architecture issues with Torch.load #42

Closed
bamos opened this Issue Oct 29, 2015 · 37 comments

Collaborator

bamos commented Oct 29, 2015

From @ananghudaya in #26:

th> net = torch.load('./models/openface/nn4.v1.t7')
...e/ananghudaya/torch/install/share/lua/5.1/torch/File.lua:289: table index is nil
stack traceback:
    ...e/ananghudaya/torch/install/share/lua/5.1/torch/File.lua:289: in function 'readObject'
    ...e/ananghudaya/torch/install/share/lua/5.1/torch/File.lua:272: in function 'readObject'
    ...e/ananghudaya/torch/install/share/lua/5.1/torch/File.lua:311: in function 'load'
    [string "net = torch.load('./models/openface/nn4.v1.t7')"]:1: in main chunk
    [C]: in function 'xpcall'
    ...e/ananghudaya/torch/install/share/lua/5.1/trepl/init.lua:648: in function 'repl'
    ...daya/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:185: in main chunk
    [C]: at 0x0804d6d0  

@bamos bamos added the bug label Oct 29, 2015

@bamos bamos added this to the v0.2.0 milestone Oct 29, 2015

@bamos bamos referenced this issue Oct 29, 2015

Closed

Broken Pipe #26

Collaborator

bamos commented Oct 29, 2015

@ananghudaya - https://github.com/teradeep/demo-apps/issues/4 indicates this is an architecture issue, which is a problem with torch.load I wasn't aware of, but is clearly in the documentation at https://github.com/torch/torch7/blob/master/doc/serialization.md. I saved the binary model in x86_64 and I think it's only compatible with x86_64. Are you using 32-bit x86 or ARM?

I've saved the model in ASCII format. Can you download it from here and unxz it?

$ md5sum nn4.v1.ascii.t7
735723e2c9cc4eefc00a7df34c9a4d3b  nn4.v1.ascii.t7

Try loading it with:

$ th
th> require 'nn'
th> require 'dpnn'
th> net = torch.load('nn4.v1.ascii.t7', 'ascii')

If this works, I think you'll just need to replace nn4.v1.t7 with nn4.v1.ascii.t7 in the Python demos and add 'ascii' to the torch.load call in https://github.com/cmusatyalab/openface/blob/master/openface/openface_server.lua.

  • Even though the ascii model is larger, I'll use it in place of the binary one everywhere to avoid issues like this. Thanks for the useful info and for helping me improve the project. I'll make the changes over the next few days.
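
For anyone scripting the checksum step instead of running md5sum by hand, here is a minimal Python sketch. The expected digest is the one quoted above for nn4.v1.ascii.t7; the function names `md5_of_file` and `verify_model` are made up for illustration and are not part of OpenFace.

```python
import hashlib

# Digest quoted in the comment above for nn4.v1.ascii.t7.
EXPECTED_MD5 = "735723e2c9cc4eefc00a7df34c9a4d3b"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to keep memory low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path, expected=EXPECTED_MD5):
    """True only if the file's digest is identical to the expected one."""
    return md5_of_file(path) == expected
```

A digest that is merely close to the expected value still means the download or decompression went wrong; only an exact match counts.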

ananghudaya commented Oct 30, 2015

Thanks @bamos

Still no luck in getting it right. I've downloaded and verified the ASCII model. Here is the output:

th> net = torch.load('nn4.v1.ascii.t7', 'ascii')
cannot open <nn4.v1.ascii.t7> in mode r  at /home/ananghudaya/torch/pkg/torch/lib/TH/THDiskFile.c:484
stack traceback:
    [C]: at 0xb720afc0
    [C]: in function 'DiskFile'
    ...e/ananghudaya/torch/install/share/lua/5.1/torch/File.lua:309: in function 'load'
    [string "net = torch.load('nn4.v1.ascii.t7', 'ascii')"]:1: in main chunk
    [C]: in function 'xpcall'
    ...e/ananghudaya/torch/install/share/lua/5.1/trepl/init.lua:648: in function 'repl'
    ...daya/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:185: in main chunk
    [C]: at 0x0804d6d0  

I'm using a 32-bit machine.

Collaborator

bamos commented Oct 30, 2015

Hi Anang, this error looks like Torch can't find the file.
Did you unxz it and check the md5sum?

-Brandon.

ananghudaya commented Nov 3, 2015

Hi @bamos,

Yes, I did. The md5 checksum is the same, and I have placed the file in the same folder as the other models.

Collaborator

bamos commented Nov 3, 2015

Please double check the path to the model.
The error message you're getting is the same error message I get for incorrect paths.

th> model = torch.load('/tmp/does-not-exist.t7')
cannot open </tmp/does-not-exist.t7> in mode r  at /home/bamos/torch/pkg/torch/lib/TH/THDiskFile.c:484
stack traceback:
    [C]: at 0x7f4389ef2a90
    [C]: in function 'DiskFile'
    /home/bamos/torch/install/share/lua/5.1/torch/File.lua:292: in function 'load'
    [string "model = torch.load('/tmp/does-not-exist.t7')"]:1: in main chunk
    [C]: in function 'xpcall'
    /home/bamos/torch/install/share/lua/5.1/trepl/init.lua:648: in function 'repl'
    ...amos/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:185: in main chunk
    [C]: at 0x00406670
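
A quick way to rule out path problems before starting th is to resolve the path the same way the process would. This is only a sketch: `diagnose_model_path` is a hypothetical helper, and it checks just the obvious causes (wrong working directory for a relative path, a directory instead of a file, missing read permission).

```python
import os


def diagnose_model_path(path):
    """Describe why a model file might fail to open, or confirm it looks fine."""
    # Relative paths are resolved against the current working directory,
    # which may differ from the directory the models live in.
    absolute = os.path.abspath(path)
    if not os.path.exists(absolute):
        return "missing: %s (cwd is %s)" % (absolute, os.getcwd())
    if not os.path.isfile(absolute):
        return "not a file: %s" % absolute
    if not os.access(absolute, os.R_OK):
        return "not readable: %s" % absolute
    return "ok: %s (%d bytes)" % (absolute, os.path.getsize(absolute))


print(diagnose_model_path("nn4.v1.ascii.t7"))
```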

bamos added a commit that referenced this issue Nov 4, 2015

Collaborator

bamos commented Nov 4, 2015

The ascii model loads in about 30-45 seconds for me and the x86 binary model loads in a few seconds. I'll add a fallback mechanism when we transition to a Lua server in #4 instead of a Lua subprocess so only non 64-bit x86 users will have the 30 second penalty, and it will only be for the first time they start the server, not every time they try to run a new Python program using OpenFace.
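
One way the fallback could look from the Python side is sketched below. It is an illustration only, not the mechanism that was eventually implemented: it assumes the choice is made from `platform.machine()`, and `choose_model` is a hypothetical helper. The file names are the two models discussed in this thread.

```python
import platform


def choose_model(machine=None):
    """Pick the fast binary model on 64-bit x86, the portable ASCII one elsewhere."""
    machine = (machine or platform.machine()).lower()
    if machine in ("x86_64", "amd64"):
        # The binary model was saved on x86_64, so it is only trusted there.
        return ("nn4.v1.t7", "binary")
    # Slower to load, but not tied to the architecture it was saved on.
    return ("nn4.v1.ascii.t7", "ascii")


model_file, model_format = choose_model()
print(model_file, model_format)
```

The returned format string would then be passed as the second argument of torch.load on the Lua side.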

snowlord commented Nov 17, 2015

I faced the same problem: Torch can't load nn4.v1.ascii.t7. I downloaded nn4.v1.ascii.t7 and checked its md5. As @bamos said, this can be caused by an incorrect path, but I tried an absolute path and it still showed:
cannot open <nn4.v1.ascii.t7> in mode r at /tmp/luarocks_torch-scm-1../torch7/lib/TH/THDiskFile.c :484

Collaborator

bamos commented Nov 17, 2015

Hi @snowlord - strange! Can you (or @ananghudaya) try saving a small file in binary format, then loading it? Then doing the same with an ASCII-formatted file?

/tmp$ th
th> t = torch.Tensor(10)
th> torch.save('test-binary.t7', t)
th> t2 = torch.load('test-binary.t7')
th> torch.save('test-ascii.t7', t, 'ascii')
th> t3 = torch.load('test-ascii.t7', 'ascii')
th> t:eq(t2):all()
true
th> t:eq(t3):all()
true

If this works, can you then try doing it in a different directory that's not your current working directory?

snowlord commented Nov 24, 2015

Hi @bamos, I switched to a 64-bit x86 machine and checked the md5 of the model file. It now shows a different problem:

th> torch.load('./models/openface/nn4.v1.t7')
/usr/local/share/lua/5.1/torch/File.lua:294: unknown object
stack traceback:
    [C]: in function 'error'
    /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:240: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:288: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:272: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:288: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:288: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:272: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:319: in function 'load'
    [string "_RESULT={torch.load('./models/openface/nn4.v1..."]:1: in main chunk
    [C]: in function 'xpcall'
    /usr/local/share/lua/5.1/trepl/init.lua:650: in function 'repl'
    /usr/local/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
    [C]: at 0x00406260  

Collaborator

bamos commented Nov 24, 2015

Hi @snowlord - interesting you're seeing that on 64-bit x86. Somebody in this thread on the torch mailing list got a similar unknown object error and said it was an architecture issue: https://groups.google.com/forum/#!msg/torch7/zNNdXATZxlA/z5A2HocVCgAJ

Does the ascii model work on your 64-bit x86 machine?

shimen commented Dec 6, 2015

Hi @bamos ,
I got the same problem:

celeb-classifier.nn4.v1.pkl  cifar10-test.t7  cifar10torchsmall.zip  cifar10-train.t7  nn2.def.lua  nn4.def.lua  nn4.v1.ascii.t7  nn4.v1.t7
-bash-4.1# th                                                                                                                              
th> require 'nn'
{..........}
                                                                      [0.0143s]
th> require 'dpnn'                                                             
true                                                                           
                                                                      [0.0113s]
th> net = torch.load('nn4.v1.t7')                               
/usr/local/share/lua/5.1/torch/File.lua:241: Failed to load function from bytecode: (binary): cannot load incompatible bytecode
stack traceback:
        [C]: in function 'error'
        /usr/local/share/lua/5.1/torch/File.lua:241: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:278: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:278: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:325: in function 'load'
        [string "net = torch.load('nn4.v1.t7')"]:1: in main chunk
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/trepl/init.lua:668: in function 'repl'
        /usr/local/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
        [C]: at 0x004051e0
                                                                      [0.0005s]
th> net = torch.load('nn4.v1.ascii.t7', 'ascii')
/usr/local/share/lua/5.1/torch/File.lua:241: Failed to load function from bytecode: (binary): cannot load incompatible bytecode
stack traceback:
        [C]: in function 'error'
        /usr/local/share/lua/5.1/torch/File.lua:241: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:278: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:278: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:325: in function 'load'
        [string "net = torch.load('nn4.v1.ascii.t7', 'ascii')"]:1: in main chunk
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/trepl/init.lua:668: in function 'repl'
        /usr/local/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
        [C]: at 0x004051e0
                                                                      [0.0053s]
th> net = torch.load('cifar10-train.t7')
                                                                      [0.0134s]
th>

As you can see, I tried loading the file provided at this link:
https://groups.google.com/forum/#!msg/torch7/zNNdXATZxlA/z5A2HocVCgAJ
and it loads just fine:
net = torch.load('cifar10-train.t7')

But loading nn4.v1.t7 fails in both formats:
net = torch.load('nn4.v1.t7')
net = torch.load('nn4.v1.ascii.t7', 'ascii')

I also ran an md5sum check:

md5sum models/{dlib/*.dat,openface/*.{pkl,t7}}
73fde5e05226548677a050913eed4e04  models/dlib/shape_predictor_68_face_landmarks.dat
c0675d57dc976df601b085f4af67ecb9  models/openface/celeb-classifier.nn4.v1.pkl
735723e2c9cc4eefc00a7df34c9a4d3b  models/openface/nn4.v1.ascii.t7
a59a5ec1938370cd401b257619848960  models/openface/nn4.v1.t7

I'm on x86_64 GNU/Linux.
What seems to be the problem?

Ilya

shimen commented Dec 6, 2015

It seems to be a problem with the Lua and LuaJIT versions:

-bash-4.1# lua -v
Lua 5.1.4 Copyright (C) 1994-2008 Lua.org, PUC-Rio
-bash-4.1# luajit -v
LuaJIT 2.0.4 -- Copyright (C) 2005-2015 Mike Pall. http://luajit.org/

These are the versions I use. Which version was the model compiled with?

Ilya

Collaborator

bamos commented Dec 6, 2015

shimen commented Dec 7, 2015

I installed LuaJIT 2.1.0-beta1, and now this command runs without errors:
net = torch.load('nn4.v1.t7')

Download it from:
https://github.com/torch/luajit-rocks

Make sure to pass the option -DWITH_LUAJIT21=ON to cmake:

git clone https://github.com/torch/luajit-rocks.git
cd luajit-rocks
mkdir build
cd build
cmake .. -DWITH_LUAJIT21=ON

lijian8 commented Dec 20, 2015

Hi @bamos ,
I'm trying OpenFace on a 32-bit ARM platform, and I changed the Torch model-loading call to
net = torch.load('nn4.v1.ascii.t7', 'ascii')
Strangely, when I run the compare demo script I get the following error message:

Error getting result from Torch subprocess.
Line read:

Exception:

could not convert string to float:

stdout:

stderr:

I ran the same code on an x86_64 platform and everything works, as expected since the ASCII version should be platform independent. Could you give me a hint about this issue on the 32-bit ARM platform? Thanks.
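
The "could not convert string to float" message together with an empty "Line read:" suggests the Python side received an empty line and passed it straight to float(). A small sketch of a stricter parser follows. It assumes the subprocess prints one line of comma-separated numbers, which is an assumption about the output format, and `parse_representation` is a hypothetical name.

```python
def parse_representation(line):
    """Parse one line of comma-separated floats from the Torch subprocess."""
    fields = [field for field in line.strip().split(",") if field]
    if not fields:
        # float("") is what produces "could not convert string to float",
        # so fail earlier with a message that points at the real cause.
        raise ValueError(
            "Torch subprocess returned an empty line; "
            "it probably exited before printing a result"
        )
    return [float(field) for field in fields]


print(parse_representation("0.25,-1.5,3\n"))
```

An empty line usually means the Lua process died before producing output, so running the Lua script by hand is the next step.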

Collaborator

bamos commented Dec 20, 2015

Hi @lijian8,

stdout:

stderr:

Are these both empty? I would expect more content.

I tried to run the same code in X86_64 platform it's all OK since
ascii version should be platform independent. Could you give some
hint about this issue I had on ARM 32 bit platform? Thanks.

I don't have any experience executing on 32-bit ARM.
Maybe the Torch community will be able to help if we can
find a more informative error message.

-Brandon.

lijian8 commented Dec 21, 2015

Hi @bamos,
Yes, these are empty. I'll try to run the subprocess directly in Torch to see if I can catch anything.

@bamos bamos removed this from the v0.2.0 milestone Dec 30, 2015

SyRenity commented Jan 21, 2016

I had a very similar issue on Jetson TK1 board, here is a solution from another project that might help:

git clone https://github.com/mvitez/torch7.git mvittorch7
cd mvittorch7
luarocks make rocks/torch-scm-1.rockspec

diff --git a/eval.lua b/eval.lua
index 1814180..8cad5ba 100644
--- a/eval.lua
+++ b/eval.lua
@@ -65,8 +65,21 @@ end
 -------------------------------------------------------------------------------
 -- Load the model checkpoint to evaluate
 -------------------------------------------------------------------------------
+local function load(filename)
+   local mode = 'binary'
+   local referenced = true
+   local file = torch.DiskFile(filename, 'r')
+   file[mode](file)
+   file:referenced(referenced)
+   file:longSize(8)
+   file:littleEndianEncoding()
+   local object = file:readObject()
+   file:close()
+   return object
+end
+
 assert(string.len(opt.model) > 0, 'must provide a model')
-local checkpoint = torch.load(opt.model)
+local checkpoint = load(opt.model)
 -- override and collect parameters
 if string.len(opt.input_h5) == 0 then opt.input_h5 = checkpoint.opt.input_h5 end
 if string.len(opt.input_json) == 0 then opt.input_json = checkpoint.opt.input_json end
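
The longSize(8) and littleEndianEncoding() calls in the diff tell the reader to treat every long in the file as 8 little-endian bytes, whatever the native long size is. The Python sketch below is not Torch code; it only illustrates, with the standard struct module and made-up values, what goes wrong when 8-byte longs are read back as 4-byte ones.

```python
import struct

# Three made-up sizes, written as 8-byte little-endian longs,
# the way a 64-bit machine would store them.
values = (3, 224, 224)
payload = struct.pack("<3q", *values)

# Reading with the matching width recovers the original numbers.
as_8_byte = struct.unpack("<3q", payload)

# Reading the same bytes as 4-byte longs yields twice as many numbers,
# with a spurious zero after each real value.
as_4_byte = struct.unpack("<6l", payload)

print(as_8_byte)  # (3, 224, 224)
print(as_4_byte)  # (3, 0, 224, 0, 224, 0)
```

Once the reader is off by one field like this, everything after it is misinterpreted, which fits the kind of "table index is nil" and "unknown object" errors reported earlier in the thread.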

jacklanchantin commented Feb 2, 2016

I had the same issue. It was fixed by the comment from SyRenity:

git clone https://github.com/mvitez/torch7.git mvittorch7
cd mvittorch7
luarocks make rocks/torch-scm-1.rockspec

SyRenity commented Feb 2, 2016

@jacklanchantin glad it helped :)

fmassa commented Feb 5, 2016

For information, torch/torch7#476 was merged into master some time ago, so all the changes in @mvitez's branch have been integrated into Torch.

lijian8 commented Feb 10, 2016

Thanks @bamos @SyRenity @jacklanchantin, this issue should be fixed with the instructions from @SyRenity.

Collaborator

bamos commented Feb 17, 2016

Great info, thanks all!

@bamos bamos closed this Feb 17, 2016

ChrisYang commented Mar 18, 2016

@SyRenity I am also working on a TK1 but still get an error when I load the binary model for OpenFace. As mentioned above, this issue should be fixed. Do you have any clue about my errors? Thanks:

net = torch.load('/home/ubuntu/Downloads/face/openface/models/openface/nn4.small2.v1.t7')
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:370: table index is nil
stack traceback:
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:370: in function 'readObject'
/home/ubuntu/torch/install/share/lua/5.1/nn/Module.lua:158: in function 'read'
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:409: in function 'load'
[string "net = torch.load('/home/ubuntu/Downloads/face..."]:1: in main chunk
[C]: in function 'xpcall'
/home/ubuntu/torch/install/share/lua/5.1/trepl/init.lua:669: in function 'repl'
...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
[C]: at 0x0000cff9

Collaborator

bamos commented Mar 18, 2016

Hi @ChrisYang - can you try using our ascii model from http://openface-models.storage.cmusatyalab.org/nn4.small2.v1.ascii.t7.xz? Unxz it and then use ascii mode in torch.load.
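
The "unxz it" step can also be done without the xz command-line tool. Below is a minimal sketch using Python's standard lzma module. The function name and the file paths are placeholders chosen for illustration; they are not part of OpenFace.

```python
import lzma
import shutil


def unxz(src_path, dst_path):
    """Decompress the .xz archive at src_path into dst_path."""
    # lzma.open understands the .xz container; copyfileobj streams in chunks,
    # so the whole model never has to be held in memory at once.
    with lzma.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path


# Example call (placeholder paths, adjust to where the download was saved):
# unxz("nn4.small2.v1.ascii.t7.xz", "nn4.small2.v1.ascii.t7")
```

The decompressed file is then loaded in th by passing 'ascii' as the second argument to torch.load, as in the interactive sessions quoted elsewhere in this thread.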

ChrisYang commented Mar 19, 2016

@bamos thanks for your prompt reply.
Though I haven't found your ASCII file, I managed to save an ASCII version on an x86 machine and can now load it on the TK1.
However, I face some new issues. It runs OK in CPU mode but very slowly on the TK1. When I tried to call net:forward in CUDA mode I got the CUDA runtime error 'too many resources requested for launch at xxx'. Do you have any clue how to solve this?

apeterswu commented Jun 3, 2016

@shimen Hi, I have the same problem as you: "File.lua failed to load function from bytecode binary string: not a precompiled chunk". I also updated my LuaJIT version to 2.1 beta, but it still fails. I don't know what to do now. Could anyone help? Thanks.

shimen commented Jun 5, 2016

@apeterswu Hi, I'm not sure what the problem is. Since OpenFace version 0.2 I have not had to use this command.

weiqifa0 commented Jul 10, 2016

Subject: openface
root@tegra-ubuntu:~/openface/openface# ./demos/compare.py images/examples/{lennon_,clapton_}
/home/ubuntu/torch/install/bin/luajit: /home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:370: table index is nil
stack traceback:
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:370: in function 'readObject'
/home/ubuntu/torch/install/share/lua/5.1/nn/Module.lua:158: in function 'read'
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:409: in function 'load'
...lib/python2.7/dist-packages/openface/openface_server.lua:46: in main chunk
[C]: in function 'dofile'
...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x0000cff9
Traceback (most recent call last):
File "./demos/compare.py", line 101, in
d = getRep(img1) - getRep(img2)
File "./demos/compare.py", line 92, in getRep
rep = net.forward(alignedFace)
File "/usr/local/lib/python2.7/dist-packages/openface/torch_neural_net.py", line 156, in forward
rep = self.forwardPath(t)
File "/usr/local/lib/python2.7/dist-packages/openface/torch_neural_net.py", line 113, in forwardPath
""".format(self.cmd, self.p.stdout.read()))
Exception:

OpenFace: openface_server.lua subprocess has died.

  • Is the Torch command th on your PATH? Check with which th.
  • If th is on your PATH, try running ./util/profile-network.lua
    to see if Torch can correctly load and run the network.
    • If this gives illegal instruction errors, see the section on
      this in our FAQ at http://cmusatyalab.github.io/openface/faq/
    • In Docker, use a Bash login shell or source
      /root/torch/install/bin/torch-activate for the Torch environment.
  • See this GitHub issue if you are running on a non-64-bit machine:
    #42
  • Please post further issues to our mailing list at
    https://groups.google.com/forum/#!forum/cmu-openface
    Diagnostic information:
    cmd: ['/usr/bin/env', 'th', '/usr/local/lib/python2.7/dist-packages/openface/openface_server.lua', '-model', '/home/ubuntu/openface/openface/demos/../models/openface/nn4.small2.v1.t7', '-imgDim', '96']

    stdout:

Has anyone encountered such a problem?

my email is 329410527@qq.com

Thank you very much.

Collaborator

bamos commented Jul 14, 2016

Collaborator

bamos commented Jul 19, 2016

Some users following this issue may also be interested in helping improve dlib and its face detector's speed on ARM by adding NEON instructions. Contact @davisking if interested. Here is his comment from another thread:

NEON instructions are similar enough in overall structure that you should
be able to implement alternative versions of the simd classes in dlib (e.g.
https://github.com/davisking/dlib/blob/master/dlib/simd/simd8f.h). All the
simd usage is through these classes, so if there were NEON versions of them
then things would be much faster on ARM. I've had this on my todo list for
a long time but haven't gotten around to it yet. You should give it a go :)

maxisme commented Aug 2, 2016

@bamos do you know of anyone successfully getting this to work on a Raspberry Pi? It is driving me crazy. I have got this working (it takes 18 seconds?!). I changed this line to point to the ASCII file and then tried to run again, but I get a File.lua:375: unknown object error. Any ideas?

nitish11 commented Aug 23, 2016

@maxisme : I got it solved. Refer issue

BrandonJoffe commented Nov 2, 2016

Hey @bamos,

I've been trying to use the Docker container in Ubuntu 14.04 on 64-bit x86 architecture. I have switched to the ASCII model and I'm getting the same error as weiqifa0 above. I'm not quite sure where to go from here other than performing a fresh install of OpenFace by hand, which I want to avoid. Any suggestions would be great!

Exception in thread frame_process_thread_0:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/host/system/SurveillanceSystem.py", line 534, in process_frame
predictions, alignedFace = self.recogniser.make_prediction(personimg,face_bb)
File "/host/system/FaceRecogniser.py", line 111, in make_prediction
persondict = self.recognize_face(alignedFace)
File "/host/system/FaceRecogniser.py", line 121, in recognize_face
if self.getRep(img) is None:
File "/host/system/FaceRecogniser.py", line 145, in getRep
rep = self.net.forward(alignedFace) # Gets embedding - 128 measurements
File "/usr/local/lib/python2.7/dist-packages/openface/torch_neural_net.py", line 156, in forward
rep = self.forwardPath(t)
File "/usr/local/lib/python2.7/dist-packages/openface/torch_neural_net.py", line 113, in forwardPath
""".format(self.cmd, self.p.stdout.read()))
Exception:

OpenFace: openface_server.lua subprocess has died.

  • Is the Torch command th on your PATH? Check with which th.
  • If th is on your PATH, try running ./util/profile-network.lua
    to see if Torch can correctly load and run the network.
    • If this gives illegal instruction errors, see the section on
      this in our FAQ at http://cmusatyalab.github.io/openface/faq/
    • In Docker, use a Bash login shell or source
      /root/torch/install/bin/torch-activate for the Torch environment.
  • See this GitHub issue if you are running on a non-64-bit machine:
    #42
  • Please post further issues to our mailing list at
    https://groups.google.com/forum/#!forum/cmu-openface

Diagnostic information:

cmd: ['/usr/bin/env', 'th', '/usr/local/lib/python2.7/dist-packages/openface/openface_server.lua', '-model', '/host/system/../models/openface/nn4.small2.v1.ascii.t7', '-imgDim', '96']

BrandonJoffe commented Nov 2, 2016

Don't worry, I just tested with Docker in an Ubuntu VM and it worked perfectly :) Not sure what the issue was.

KGOURAV commented Apr 2, 2017

Hey @bamos,
as you suggested, I saved the model in ASCII format, and these commands work perfectly:

$ th
th> require 'nn'
th> require 'dpnn'
th> net = torch.load('nn4.v1.ascii.t7', 'ascii')

But when I try this command:

./demos/classifier.py infer ./generated-embeddings/classifier.pkl your_test_image.jpg

this is the error I am getting:

/home/pi/torch/install/share/lua/5.1/torch/File.lua:375: unknown object
stack traceback:
[C]: in function 'error'
/home/pi/torch/install/share/lua/5.1/torch/File.lua:375: in function 'readObject'
/home/pi/torch/install/share/lua/5.1/torch/File.lua:409: in function 'load'

./batch-represent/main.lua:33: in main chunk
[C]: in function 'dofile'
...e/pi/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00014fa8
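
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the interactive session passes 'ascii' to torch.load, while the script reads the same file in the default binary mode. A quick way to see which kind of file a script is being handed is to inspect its first bytes. The sketch below is a heuristic only. The function name and the 64-byte sample size are made up for illustration, and it assumes that a text-serialized file starts with printable characters while a binary one contains non-printable bytes early on.

```python
import string

# Bytes expected at the start of a text-serialized file (heuristic assumption).
_TEXT_BYTES = frozenset(string.printable.encode("ascii"))


def looks_like_ascii_model(path, sample_size=64):
    """Return True if the first sample_size bytes are all printable ASCII."""
    with open(path, "rb") as f:
        head = f.read(sample_size)
    # An empty file is neither format, so report it as "not ASCII".
    return bool(head) and all(b in _TEXT_BYTES for b in head)
```

If this returns True for the model path a script receives, the torch.load call in that script probably needs the 'ascii' argument as well.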

mattanimation commented May 2, 2017

Had the same issue on Ubuntu 16.04 with torch7. The ASCII loading method worked with the provided ASCII model download link. I just had to modify ./batch-represent/opt.lua and main.lua, which load the model in the classification example on the OpenFace website. However, running the ./demos/compare.py example, which uses the OpenFace Python API, hits the same error. If the cmd in torch_neural_net.py could accept an ascii option, might that be a way to work around it?
self.cmd = ['/usr/bin/env', 'th', os.path.join(myDir, 'openface_server.lua'), '-model', model, '-imgDim', str(imgDim)]

-- update
I also modified the torch_neural_net.py and openface_server.lua to include the ascii argument and it indeed works as well.
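
A rough sketch of what such a change to the command construction might look like. The -ascii flag and the build_server_cmd helper are hypothetical: neither is part of the stock torch_neural_net.py or openface_server.lua, and the Lua side would need a matching option that passes 'ascii' to torch.load.

```python
import os


def build_server_cmd(my_dir, model, img_dim, ascii_model=False):
    """Build the argv list used to launch openface_server.lua."""
    cmd = ['/usr/bin/env', 'th', os.path.join(my_dir, 'openface_server.lua'),
           '-model', model, '-imgDim', str(img_dim)]
    if ascii_model:
        # Hypothetical flag: openface_server.lua must be patched to accept it.
        cmd.append('-ascii')
    return cmd
```

Keeping the flag optional and off by default means existing callers that load the binary model on x86_64 would see the same command as before.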
