
Memory Error For TWITTER-US Dataset #9

Closed
MortonWang opened this issue Apr 7, 2019 · 3 comments

Comments

MortonWang commented Apr 7, 2019

Hi, Tiiiger

I am very interested in your recent SGC work.

I want to apply your SGC code to semi-supervised user geolocation, one of the downstream tasks in your paper.

The GEOTEXT dataset works fine, but when I switch to TWITTER-US and TWITTER-WORLD, the code crashes. The error is listed below:

File "/home/wtl/桌面/wtlCode/geoSGC/dataProcess.py", line 96, in process_data
features = torch.FloatTensor(features.to_dense())
RuntimeError: $ Torch: not enough memory: you tried to allocate 417GB. Buy new RAM! at /pytorch/aten/src/TH/THGeneral.cpp:201

I have tried different versions of Python and torch, such as Python 2.7 + torch 1.0.1.post2 and Python 3.5 + torch 1.0.1.post2, but both failed. I also googled for a solution, but none of the methods I found worked.

Have you encountered a similar error, and how did you fix it? My machine runs Ubuntu 16.04 with 40GB of memory.

Many thanks for your help.

-Morton


Tiiiger commented Apr 8, 2019

@felixgwu


felixgwu commented Apr 8, 2019

Hi Morton,

The reason is that the TWITTER-US and TWITTER-WORLD have high-dimensional sparse features.
Converting these features to a dense tensor would require too much memory.
In our experiments, we keep the features as a sparse tensor. We use Afshin's code base written in Theano, so we didn't implement it in PyTorch; however, you may consider converting the input features to a torch.sparse.FloatTensor.
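A minimal sketch of that conversion, assuming the features are loaded as a SciPy sparse matrix (the helper name is illustrative, not part of the repository):

```python
import numpy as np
import scipy.sparse as sp
import torch

def scipy_to_torch_sparse(mat):
    """Convert a SciPy sparse matrix into a torch.sparse.FloatTensor (COO layout)."""
    mat = sp.coo_matrix(mat).astype(np.float32)
    indices = torch.from_numpy(np.vstack((mat.row, mat.col)).astype(np.int64))
    values = torch.from_numpy(mat.data)
    return torch.sparse.FloatTensor(indices, values, torch.Size(mat.shape))
```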
BTW, we don't pre-compute the propagated features on these datasets. Instead, using the associative property, we multiply the node features by the weight matrix first and then do the K-step propagation, which reduces memory usage.
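A rough illustration of that associativity trick, assuming a normalized sparse adjacency adj and sparse features x (the module and argument names are assumptions, not the code we used):

```python
import torch
import torch.nn as nn

class SparseSGC(nn.Module):
    """SGC over sparse inputs: multiply by the weights first, then propagate K times."""
    def __init__(self, in_dim, out_dim, K):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim)
        self.K = K

    def forward(self, adj, x):
        # adj: sparse (N, N) normalized adjacency; x: sparse (N, in_dim) features
        h = torch.sparse.mm(x, self.W.weight.t()) + self.W.bias  # XW: dense (N, out_dim)
        for _ in range(self.K):
            h = torch.sparse.mm(adj, h)  # S^K (XW), no dense feature matrix needed
        return h
```

The only dense tensor created here is N x out_dim, so the dense feature conversion that triggered the 417GB allocation above is never needed.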

-Felix


Tiiiger commented Apr 9, 2019

@MortonWang I am closing this since there don't seem to be any further questions. Feel free to reopen.

Tiiiger closed this as completed Apr 9, 2019
felixgwu mentioned this issue Jun 19, 2019