Update quickstart.md #270

Closed
findinpath wants to merge 1 commit into apache:main from findinpath:patch-1

@findinpath

Update quickstart.md to explicitly list all the commands needed to set up the Accumulo environment on a single node.
When only the tserver is started, some commands (e.g. `createtable mytable`) executed in the Accumulo shell will hang indefinitely.

accumulo-service master start
accumulo-service tserver start
accumulo-service monitor start
Contributor

You can use the accumulo-cluster command to start all of the services, either on a single node or on multiple nodes. These commands are only necessary if you elect to run each service manually (or, say, you need to restart just one). Maybe it would be more user-friendly if the accumulo-cluster command section came before this section?

Also, for a complete installation I believe that you might be missing necessary services (gc). Then maybe this could be written as
accumulo [service-name] or accumulo-service [service-name] start | stop, with a more complete list of the required service names?
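
For anyone starting the services by hand, here is a minimal shell sketch of the idea above. It assumes the service names mentioned in this thread (master, tserver, gc, monitor); that list may not be complete for every Accumulo version. `start_services` and `DRY_RUN` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Service names taken from this thread; treat the list as an assumption,
# not as the authoritative set for your Accumulo version.
SERVICES="master tserver gc monitor"

# Hypothetical helper: starts each service with accumulo-service.
# With DRY_RUN=1 (the default here) it only prints the commands.
start_services() {
  for svc in $SERVICES; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "accumulo-service $svc start"
    else
      accumulo-service "$svc" start
    fi
  done
}

# Print the commands that would be run.
start_services
```

Setting `DRY_RUN=0` before calling `start_services` would actually run the commands. As noted above, `accumulo-cluster start` is the single-command alternative.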

Author

@EdColeman I'm a newbie with Accumulo, and while trying to create a table I simply didn't know why it hung.
I spent more than an hour debugging and pondering it, and only afterwards did I realize that there are more services to start, not only the tserver.

Obviously my modification is quite naive because there are probably more services. So what you are suggesting with accumulo [service-name] fits much better.

The quick start for me wasn't so quick :)

I came across the sample code https://github.com/apache/accumulo-examples/blob/main/docs/sample.md which gives a few pointers on how to work with the accumulo shell - I find it quite good for a quick start - to get a feeling about what accumulo actually does.

Contributor

@EdColeman Apr 7, 2021

The docs start with "consider using fluo-uno"; that's the easiest on-ramp that I know of. Is there a reason that you did not try it? Would changing or adding wording there have made you more likely to use fluo-uno?

Moving the cluster command before the individual commands might have made your experience easier. With accumulo-cluster start, everything should have been started, and it saves the beginner from needing five commands instead of one.

But with that, uno fetch accumulo, uno start accumulo is way easier than setting up accumulo from scratch.

Author

A bit unrelated, but another small issue I've encountered was the ClassNotFoundError regarding zookeeper's KeeperException class.

The cause was this line in accumulo-env.sh:

 CLASSPATH="${CLASSPATH}:${lib}/*:${HADOOP_CONF_DIR}:${ZOOKEEPER_HOME}/*:${HADOOP_HOME}/share/hadoop/client/*"

I replaced it with:

 CLASSPATH="${CLASSPATH}:${lib}/*:${HADOOP_CONF_DIR}:${ZOOKEEPER_HOME}/lib/*:${HADOOP_HOME}/share/hadoop/client/*"

(added lib after ZOOKEEPER_HOME)
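
To make the difference between the two layouts concrete, here is a rough sketch of how a script could pick the jar directory. It assumes the ZooKeeper jar is named `zookeeper-*.jar` and sits either in `lib/` or directly under the install directory. `zk_jar_dir` is a hypothetical helper, not something shipped in accumulo-env.sh, and `/opt/zookeeper` is only a placeholder path.

```shell
#!/bin/sh
# Hypothetical helper: prints the directory that holds the ZooKeeper jars.
# Assumption: the jars live in $1/lib when a zookeeper-*.jar is found there,
# and directly under $1 otherwise.
zk_jar_dir() {
  zk_home="$1"
  if ls "$zk_home"/lib/zookeeper-*.jar >/dev/null 2>&1; then
    echo "$zk_home/lib"
  else
    echo "$zk_home"
  fi
}

# Example: show the classpath fragment for the detected layout.
# /opt/zookeeper is a placeholder used when ZOOKEEPER_HOME is unset.
ZK_JARS="$(zk_jar_dir "${ZOOKEEPER_HOME:-/opt/zookeeper}")"
echo "ZooKeeper classpath entry: ${ZK_JARS}/*"
```

The printed entry is what would take the place of `${ZOOKEEPER_HOME}/*` or `${ZOOKEEPER_HOME}/lib/*` in the CLASSPATH line above.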

Author

> But with that, uno fetch accumulo, uno start accumulo is way easier than setting up accumulo from scratch.

I'm looking through the uno code now and see that it downloads everything that is needed. I'm sorry, but since I already have ZooKeeper and Hadoop on my machine, I opted to simply install the bin archive of Accumulo.

Contributor

For the ZooKeeper issue we'd need to know the ZooKeeper and Accumulo versions. There were changes in the ZooKeeper 3.x series that modified where ZooKeeper stores its jars and separated jute (the zk comms layer) into a separate jar. That's one benefit of uno: it should download compatible versions of things. It also allows you to point at your local repo and build / run Accumulo with your changes, if you think you might go down that route in the future.

Author

I tried with both Apache ZooKeeper 3.5.9 and ZooKeeper 3.7.0.
In both of them, the .jar libraries are located under the ZOOKEEPER_HOME/lib directory.

Contributor

What version of Accumulo?

Member

Since that is unrelated, let's not bog this issue down with that issue. See apache/accumulo#1530 for more.

Comment on lines 138 to +142
Start Accumulo processes (tserver, master, monitor, etc) using command below:

accumulo master
accumulo tserver
accumulo monitor
Member

I think this was an example explaining how to start an individual service, generally. Rather than add the other services as an example, it would probably be better to reword the instructions to make it clear that this is just one example.

(similar comment below)

Member

I made some suggested improvements in #271. Please take a look and see if they help alleviate the concerns you were trying to address in this PR.

Author

It looks good to me, but I am now a bit biased, because in the meantime I have read more about the architecture of Accumulo and know a bit about how to use it.

At the very least, I'd highlight which of the services need to run on a local installation. For example, I don't run the GC and it's still fine. You may say that I should have used Uno, but some other users may also try their luck with a direct local installation of Accumulo.

Also important is a link to the wonderful accumulo-examples repository. This is a gem containing quite useful stuff to get started with Accumulo.

A troubleshooting section would also be quite welcome. Without troubleshooting tips, some users will give up early when trying to set up Accumulo.

Author

findinpath commented Apr 8, 2021 via email

@EdColeman
Contributor

The pom for 2.0.1 specifies zookeeper 3.4.14.

For the next release (main branch), the pom currently has ZooKeeper version 3.5.9 and it looks like the accumulo-env.sh has been updated with the zookeeper jar locations ($ZOOKEEPER_HOME/lib/) for later zookeeper releases.

@findinpath findinpath closed this Apr 8, 2021
@findinpath
Author

Thank you @EdColeman and @ctubbsii for taking the time to talk with me on the quick start documentation topic.

3 participants