
read data from hdfs #1

Closed
formath opened this issue Aug 31, 2016 · 2 comments

formath commented Aug 31, 2016

"Different node should owns different parts of all Train data. This simple script did not do this job, so you should prepare it at last. " I saw this in cluster training wiki. So, could paddle read data from hdfs and distribute data to each node automatically?

reyoung self-assigned this Aug 31, 2016

reyoung (Collaborator) commented Aug 31, 2016

Distributing data across the cluster is not supported in PaddlePaddle yet. You can read data directly from an HDFS file path with PyDataProvider2.

PaddlePaddle does not handle fetching the data file remotely; it just passes the file path into a Python function. It is the user's job to open the file (or SQL connection string, or HDFS path) and read each sample from it one by one.

Contributions of a script that distributes data across the cluster are welcome, or we may add it ourselves if this feature proves necessary.
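
For illustration, a minimal sketch of such a provider (not from this thread, and only one assumption about how it could be wired up): it uses the PyDataProvider2 `@provider` decorator, assumes a tab-separated `label<TAB>f1 f2 ... fN` line format, and streams the HDFS file through the `hdfs dfs -cat` command line. `FEATURE_DIM` and `NUM_CLASSES` are placeholder values.

```python
import subprocess

from paddle.trainer.PyDataProvider2 import provider, dense_vector, integer_value

FEATURE_DIM = 100   # hypothetical feature width
NUM_CLASSES = 2     # hypothetical number of label classes


@provider(input_types=[dense_vector(FEATURE_DIM), integer_value(NUM_CLASSES)])
def process(settings, file_name):
    # PaddlePaddle only passes `file_name` in; opening it is up to us.
    # Since the path here is an HDFS URI, stream the file through the
    # `hdfs dfs -cat` CLI instead of open().
    cat = subprocess.Popen(["hdfs", "dfs", "-cat", file_name],
                           stdout=subprocess.PIPE)
    for line in cat.stdout:
        label, features = line.strip().split("\t")
        yield [float(v) for v in features.split()], int(label)
    cat.wait()
```

The training config would then list the HDFS paths as the entries of the train/test file list, so each entry is handed to `process` exactly as described above.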

paddle-bot added the status/developing 开发中 label Sep 22, 2023
paddle-bot reopened this Sep 22, 2023
paddle-bot closed this as completed Sep 24, 2024

paddle-bot commented Sep 26, 2024

Since you haven't replied for more than a year, we have closed this issue/PR.
If the problem is not solved or there is a follow-up, please reopen it at any time and we will continue to follow up.
