Integration with accumulo #32

Open
madhvi-gupta opened this issue Aug 5, 2015 · 3 comments

Comments

@madhvi-gupta

How can Accumulo be made a data source for distributedR, so that analytics can be run over that data in parallel?

@fun-indra
Contributor

Hi Madhvi,
The issue with running distributedR with Accumulo is that you need a connector to read data from Accumulo into R. We have neither created nor tested any data loaders for Accumulo. You are welcome to search for other open source R-Accumulo connectors. A quick search turns up https://github.com/DataTacticsCorp/raccumulo (though I have no idea whether it works or not).

We will soon release an HDFS connector. It will help you load data directly from HDFS and run distributedR applications.
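
To illustrate what such a connector would have to provide, here is a rough sketch of how a per-partition loader could fill a distributedR `darray`. The function `read_rows()` is a hypothetical stand-in that only generates numbers; a real version would call whatever scan function an Accumulo connector exposes, which has not been tested here. The table size and partition count are made-up values, and the `darray`, `foreach`, `splits`, and `update` calls follow the pattern in the distributedR user guide.

```r
# Split n_rows into n_parts contiguous row ranges, one per partition.
# Assumes n_rows is divisible by n_parts so all darray blocks are equal in size.
row_ranges <- function(n_rows, n_parts) {
  stopifnot(n_rows %% n_parts == 0)
  size <- n_rows %/% n_parts
  starts <- seq(1, n_rows, by = size)
  data.frame(start = starts, end = starts + size - 1)
}

# Assumed example sizes, not taken from any real table.
n_rows <- 12; n_cols <- 3; n_parts <- 4
ranges <- row_ranges(n_rows, n_parts)

# The distributed part only runs where distributedR is installed.
if (requireNamespace("distributedR", quietly = TRUE)) {
  library(distributedR)
  distributedR_start()

  # One block of rows per partition, all columns in each block.
  A <- darray(dim = c(n_rows, n_cols), blocks = c(n_rows / n_parts, n_cols))

  foreach(i, 1:npartitions(A),
          function(a = splits(A, i), idx = i, rngs = ranges, nc = n_cols) {
    # HYPOTHETICAL stand-in for a connector scan over rows from..to.
    read_rows <- function(from, to, ncol) {
      matrix(as.numeric(rep(from:to, times = ncol)), ncol = ncol)
    }
    a <- read_rows(rngs$start[idx], rngs$end[idx], nc)
    update(a)  # publish the filled partition back to the darray
  })

  print(dim(getpartition(A)))
  distributedR_shutdown()
}
```

The point of the sketch is that each worker reads only its own row range, so the connector must support range-limited scans and be installed on every worker node.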

@madhvi-gupta
Author

On Thursday 06 August 2015 02:02 AM, IndraR wrote:

Hi Madhvi,
The issue with running distributedR with Accumulo is that you need a
connector to read data from Accumulo into R. We have neither created nor
tested any data loaders for Accumulo. You are welcome to search for
other open source R-Accumulo connectors. A quick search turns up
https://github.com/DataTacticsCorp/raccumulo (though I have
no idea whether it works or not).

We will soon release an HDFS connector. It will help you load data
directly from HDFS and run distributedR applications.


Hi Indra,

I am currently trying to use raccumulo (the GitHub link you shared) to
load data into distributedR, but it is not working as required. It does
not load the whole data set into R.

Thanks and Regards
Madhvi Gupta

@fun-indra
Contributor

As I mentioned, we have not tried or tested any Accumulo connectors. Still, are you able to load data in a single R session (not distributedR) using that connector? What code did you use with distributedR? What is the error? How much data is getting loaded?
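
For the first and last questions, a check along these lines in a plain R session would show how much data the connector actually returns. `scan_table()` is a hypothetical placeholder that returns a fixed data frame; it would be replaced by the real connector call, whose name and arguments are not known from this thread. The expected row count is likewise an assumed value that would come from an independent source.

```r
# HYPOTHETICAL placeholder for the connector's read call.
scan_table <- function() {
  data.frame(row = sprintf("r%03d", 1:100), value = rnorm(100),
             stringsAsFactors = FALSE)
}

expected_rows <- 250L  # assumed: row count obtained independently of R

loaded <- scan_table()

# Summarize what arrived so a partial load is easy to spot.
report <- list(
  loaded_rows   = nrow(loaded),
  expected_rows = expected_rows,
  complete      = nrow(loaded) == expected_rows,
  first_key     = min(loaded$row),
  last_key      = max(loaded$row)
)
str(report)
```

If the loaded count is already short in a single session, the problem lies in the connector or its scan settings rather than in distributedR.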
